docs: streamline 3 READMEs and fix EN language switcher (#6)

Slim overview across EN/zh/JA, drop badges/blockquote/contributing, trim LICENSE header; fix the English switcher to point at the repo homepage instead of the GitHub site root.
2026-06-02 15:33:08 +08:00
parent 6da87df73a
commit cffe4da4ca
4 changed files with 69 additions and 148 deletions
@@ -1,5 +1,3 @@
 InfiPlot Copyright (C) 2025-2026 InfiPlot Contributors
                    GNU AFFERO GENERAL PUBLIC LICENSE
                       Version 3, 19 November 2007
@@ -1,21 +1,8 @@
-[English](README.md) · [简体中文](README.zh-CN.md) · 日本語
+[English](https://github.com/zonghaoyuan/infiplot "Back to homepage") · [简体中文](README.zh-CN.md) · 日本語
-# InfiPlot
+# ⚡ 概要
-> AI がリアルタイムに生成する、初のインタラクティブ・ストーリーゲーム —— あなたが思い描く場面を入力すれば、それが目の前に、没入感たっぷりのビジュアルとして立ち上がり、あなた自身がその中へ飛び込めます。一幕一幕の筋書きも、一枚一枚の画像も、一人ひとりのキャラクターも、すべてマルチモーダル AI エンジンがその場で設計・生成します —— 幼い頃に夢見た「アニメの中に飛び込む」あの願いを叶えるために。
+InfiPlot は、AI がコンテンツをリアルタイムに生成するインタラクティブ・ストーリーゲームです。あらかじめ用意された筋書きもキャラクターもなく、すべてがあなたの求めに応じてその場で生成されます。
 [![License: AGPL v3](https://img.shields.io/badge/License-AGPL%20v3-blue.svg)](LICENSE)
 [![GitHub stars](https://img.shields.io/github/stars/zonghaoyuan/infiplot?style=social)](https://github.com/zonghaoyuan/infiplot/stargazers)
 **▶ ベータ期間中は無料でプレイ、セットアップ不要 —— [infiplot.com](https://infiplot.com)**
 このプロジェクトを面白いと思っていただけたら、リポジトリへの ⭐ が私たちにとって何よりの励みになり、より多くの人に届く助けにもなります。ありがとうございます！
 ---
 ## InfiPlot とは
 InfiPlot は、AI がコンテンツをリアルタイムに生成する世界初のインタラクティブ・ストーリーゲームです。あらかじめ用意された筋書きもキャラクターも、声色さえもありません。すべてはあなたの求めに応じて、その場でカスタマイズして生成されます。私たちは、美少女ゲーム（galgame）、乙女ゲーム、フルモーション・ビデオ（FMV）ゲームに比肩する体験を目指しつつ、一人ひとりに合わせた*参加型のファンタジー*を提供します —— より没入感のある視聴体験で、あなたの想像力と好奇心を存分に満たします。
 ひとことで言えば、私たちが作っているのは、AI がリアルタイムにコンテンツを生成する『Love Is All Around（完蛋！我被美女包围了！）』です。
@@ -25,7 +12,13 @@ InfiPlot は、AI がコンテンツをリアルタイムに生成する世界
 ---
-## 開発の動機
+## 🌐 ライブデモ
 無料でプレイ、セットアップ不要：[infiplot.com](https://infiplot.com)
 ---
 ## チームとビジョン
 私たちは、清華大学をはじめとする大学に集う若者のグループです。
@@ -33,7 +26,8 @@ InfiPlot は、AI がコンテンツをリアルタイムに生成する世界
 もう一方で、私たちはたまたま大規模モデルの技術を少しばかり理解しており、AI でアイデアを素早く形にでき、技術の道筋や既存技術で実現できる製品の限界について、ささやかな考えを持っていました。
-きっかけは 2026 年 4 月 22 日、[@zan2434](https://x.com/zan2434) たちが [flipbook](https://flipbook.page/) を公開したことでした。この全く新しいインタラクションの形に、私たちは驚き、心を躍らせました。そして 5 月のある日、意気投合し、こうした製品を作ろうと決めました —— かつて諦めた幻想を叶える手助けをしつつ、マルチモーダルモデルがもたらす新しいインタラクションの形を探るために。
+きっかけは 2026 年 4 月 22 日、[@zan2434](https://x.com/zan2434) たちが [flipbook](https://flipbook.page/) を公開したことでした。この全く新しいインタラクションの形に、私たちは驚き、心を躍らせました。
 そして 5 月のある日、意気投合し、こうした製品を作ろうと決めました —— かつて諦めた幻想を叶える手助けをしつつ、マルチモーダルモデルがもたらす新しいインタラクションの形を探るために。
 プロジェクトはまだごく初期で、多くの機能が未完成です。[issue](https://github.com/zonghaoyuan/infiplot/issues) でのフィードバックを歓迎します。あるいは開発チームに加わって、一緒に新たな可能性を探り、あなた自身の好奇心を満たしてください。
@@ -47,7 +41,7 @@ InfiPlot は、AI がコンテンツをリアルタイムに生成する世界
 一回のプレイ全体を、私たちは**ストーリー（story）**と呼んでいます。
-物語は一連の**シーン（scene）**として展開します。各シーンは、AI が描いた 1 枚の背景画と、短い**ビート（beat）**のツリー —— ナレーション、セリフ、ときおりの選択肢 —— で構成されます。シーン内のビートをタップしていく間、画像はそのまま動きません。選択肢が本当に新しい場所 —— 別の空間、新しい視点、時間の跳躍 —— へ導いたときだけ、AI は次のシーンを描きます。
+物語は一連のシーン（scene）として展開します。各シーンは、AI が描いた 1 枚の背景画と、短いビート（beat）のツリー —— ナレーション、セリフ、ときおりの選択肢 —— で構成されます。シーン内のビートをタップしていく間、画像はそのまま動きません。選択肢が本当に新しい場所 —— 別の空間、新しい視点、時間の跳躍 —— へ導いたときだけ、AI は次のシーンを描きます。
 ```mermaid
 flowchart TD
@@ -64,9 +58,9 @@ flowchart TD
    SC -. 次のシーンを先回り生成 .-> W
 ```
-あなたがひとつのシーンを読んでいる間に、エンジンは**選択肢が導きうるシーンを先回りして生成**します —— 避けられない次の一歩については、そのさらに先のシーンまで。あなたが方向を選ぶ頃には、その画像はたいてい描き上がっているので、切り替えは一瞬に感じられます。いまはまだ多少の遅延を感じるかもしれませんが、ご安心ください —— 私たちは鋭意改善に取り組んでいます。
+あなたがひとつのシーンを読んでいる間に、エンジンは選択肢が導きうるシーンを先回りして生成します —— 避けられない次の一歩については、そのさらに先のシーンまで。あなたが方向を選ぶ頃には、その画像はたいてい描き上がっているので、切り替えは一瞬に感じられます。いまはまだ多少の遅延を感じるかもしれませんが、ご安心ください —— 私たちは鋭意改善に取り組んでいます。
-ボタンではなく背景そのものをクリックすると、**ビジョン（vision）**モデルを経由します。タップした位置を読み取り、いまのシーンを探索しているのか（新しい画像なしでビートを挿入）、先へ進もうとしているのか（新しいシーン）を判断します。これは flipbook から学んだ貴重な知見に基づくもので、この機能はいずれ InfiPlot を特徴づける鍵となり、プレイ体験をもう一段引き上げてくれると信じています。
+ボタンではなく背景そのものをクリックすると、ビジョン（vision）モデルを経由します。タップした位置を読み取り、いまのシーンを探索しているのか（新しい画像なしでビートを挿入）、先へ進もうとしているのか（新しいシーン）を判断します。これは flipbook から学んだ貴重な知見に基づくもので、この機能はいずれ InfiPlot を特徴づける鍵となり、プレイ体験をもう一段引き上げてくれると信じています。
 アートの中には、従来型のゲーム UI は一切焼き込まれていません。AI は、あなたが選んだ任意のスタイル —— 「方眼紙の棒人間」でも「サイバーパンク・ノワール」でも —— で世界を描きます。セリフ枠と選択肢ボタンは、その上に重ねた軽量な HTML レイヤーで、シーンになじむよう調整されています。つまり UI は、毎回同じではなく、そのプレイの物語に寄り添って変化するのです。
@@ -82,7 +76,7 @@ flowchart TD
 ## 設定ガイド
-InfiPlot は 4 種類のモデルプロバイダと通信します。**テキスト（Text）・ビジョン（Vision）は、任意の OpenAI 互換エンドポイント**（OpenAI、OpenAI 互換プロキシ経由の Anthropic、Gemini、OpenRouter、DeepSeek、ローカルの Ollama など）を使用でき、自由に組み合わせられます。**画像（Image）**は現在 **Runware**（OpenAI 互換ではなく、独自の task-array プロトコル）を使用します —— レイテンシとコストを総合的に考慮した選択です。**音声（TTS）**は **Xiaomi MiMo** の独自音声デザイン/クローンプロトコルを使用します（これも OpenAI 互換ではありません）—— キャラクターごとの音声デザイン、クローン、行ごとの抑揚指示に対応します。
+InfiPlot は 4 種類のモデルプロバイダと通信します。**テキスト（Text）・ビジョン（Vision）は、任意の OpenAI 互換エンドポイント**を使用でき、自由に組み合わせられます。**画像（Image）**は現在 **Runware**（OpenAI 互換ではなく、独自の task-array プロトコル）を使用します。**音声（TTS）**は **Xiaomi MiMo** の独自音声デザイン/クローンプロトコルを使用します —— キャラクターごとの音声デザイン、クローン、行ごとの抑揚指示に対応します。
 **1. プロバイダを選ぶ**
@@ -91,7 +85,7 @@ InfiPlot は 4 種類のモデルプロバイダと通信します。**テキス
 | Text · ストーリー監督  | `TEXT_BASE_URL` `TEXT_API_KEY` `TEXT_MODEL`        | ✅ | DeepSeek の `deepseek-v4-flash` |
 | Image · シーン描画  | `IMAGE_BASE_URL` `IMAGE_API_KEY` `IMAGE_MODEL`     | ✅ | [Runware](https://runware.ai) の `runware:400@6`（FLUX.2 [klein] 9B KV） |
 | Vision · クリック解釈  | `VISION_BASE_URL` `VISION_API_KEY` `VISION_MODEL`  | ✅ | Google の `gemini-3.5-flash` |
-| TTS · キャラクター音声 | `TTS_BASE_URL` `TTS_API_KEY` `TTS_SPEECH_MODEL` | 任意 —— 空欄なら無音で動作 | Xiaomi MiMo の `mimo-v2.5-tts`（独自プロトコル、OpenAI 互換ではない） |
+| TTS · キャラクター音声 | `TTS_BASE_URL` `TTS_API_KEY` `TTS_SPEECH_MODEL` | 任意 —— 空欄なら無音で動作 | Xiaomi MiMo の `mimo-v2.5-tts` |
 **2. 環境変数を設定する**
@@ -105,44 +99,24 @@ Vercel プロジェクト（**Settings → Environment Variables**）、また
 **3. コストに注意**
-推奨の 3 点セットでは、各**シーン**のコストの大半はテキスト LLM の呼び出しです。FLUX.2 [klein] 9B KV の画像は 1 シーンあたり概ね **\$0.00078**（1792×1024、4 ステップ、サブ秒）で、残りはテキスト呼び出しです（deepseek-v4-flash を使えば非常に安価）。シーン内のビートをタップしていくのは無料です。切り替えを一瞬に保つため、エンジンは**選ぶ可能性はあるが実際には選ばないシーンも先行生成**します —— そのため実際の支出は、あなたが実際に見るシーン数よりやや高くなります。標準ではレート制限も認証もありません —— デプロイを公開すれば、請求額にそのまま反映されます。広く共有する前に、制限を追加し（必要に応じてプリフェッチの深さを下げ）てください。
+推奨の 3 点セットでは、各シーンのコストは主に画像生成モデルによるものです。FLUX.2 [klein] 9B KV の画像は 1 シーンあたり概ね **$0.00078**（1792×1024、4 ステップ、サブ秒）です。テキストモデルには `deepseek-v4-flash` を使用するため、テキストのコストはそれに比べてごくわずかです。シーン内のビートをタップしていくのは無料です。切り替えを一瞬に保つため、エンジンは選ぶ可能性はあるが最終的に選ばないシーンも先行生成します —— そのため実際の支出は、あなたが実際に見るシーン数よりやや高くなります。
 ---
-## ロードマップ
+## Roadmap
-- [ ] 知覚できないほどの低遅延
+- [ ] 生成遅延を体感できないレベルまで下げる
 - [ ] より多くのモデルプロバイダに対応
 - [ ] プレイ中の自由入力対応
 - [ ] モバイルブラウザ対応
 - [ ] 大半のプロバイダに対応
 - [ ] ユーザー登録・ログイン機能
 - [ ] 静止画から動画へのアップグレード
 - [ ] プレイ中の自由入力対応
 - [ ] 音声インタラクション
-- [ ] ストーリー共有機能
+- [ ] プレイ中のストーリーを共有
 - [ ] モバイルアプリ
 ---
 ## コントリビュート
 Issue と Pull Request を歓迎します。InfiPlot をローカルで実行するには（Node 20+ と pnpm 9+ が必要）：
 ```bash
 pnpm install
 cp apps/web/.env.example apps/web/.env.local   # キーを記入 —— 設定ガイドを参照
 pnpm dev                                        # http://localhost:3000 を開く
 ```
 コントリビュートすることで、あなたの貢献が AGPL-3.0 の下でライセンスされることに同意したものとみなされます。
 ---
 ## スター推移
 [![Star History Chart](https://api.star-history.com/svg?repos=zonghaoyuan/infiplot&type=Date)](https://star-history.com/#zonghaoyuan/infiplot&Date)
 ---
 ## ライセンス
 [AGPL-3.0](LICENSE) © InfiPlot。コアは完全にオープンソースです。AGPL の「ネットワーク利用は配布とみなす」条項により、改変版をネットワークサービスとして運用する者は、そのソースコードも公開しなければなりません —— これによりコアをオープンに保ちつつ、将来のホスティング版や商用版の余地も残しています。
@@ -1,21 +1,8 @@
-English · [简体中文](README.zh-CN.md) · [日本語](README.ja.md)
+[English](https://github.com/zonghaoyuan/infiplot) · [简体中文](README.zh-CN.md) · [日本語](README.ja.md)
-# InfiPlot
+# ⚡ Overview
-> The first real-time, AI-generated interactive story game — describe the scene you've always fantasized about and watch it come alive in front of you: immersive, visual, and yours to step into. Every plot beat, every image, every character is designed and generated on the fly by a multimodal AI engine — all to grant that childhood wish of crossing over into the cartoon.
+InfiPlot is an interactive story game with content generated by AI in real time. There are no pre-written plots and no pre-made characters — everything is generated on demand, tailored to you.
 [![License: AGPL v3](https://img.shields.io/badge/License-AGPL%20v3-blue.svg)](LICENSE)
 [![GitHub stars](https://img.shields.io/github/stars/zonghaoyuan/infiplot?style=social)](https://github.com/zonghaoyuan/infiplot/stargazers)
 **▶ Free to play during the beta, no setup required — [infiplot.com](https://infiplot.com)**
 If you find this project interesting, a free ⭐ on the repo means the world to us — and it genuinely helps more people discover it. Thank you!
 ---
 ## What is InfiPlot
 InfiPlot is the world's first interactive story game with content generated by AI in real time. There are no pre-written plots, no pre-made characters, not even pre-recorded voices — everything is generated on demand, tailored to you. We're aiming for an experience on par with bishōjo games (galgame), otome games, and full-motion-video (FMV) games, while adding a personal, *participatory fantasy* — a richer, more immersive audio-visual experience that gives your imagination and curiosity free rein.
 In one line: what we're building is an AI-generated, real-time take on *Love Is All Around* (《完蛋！我被美女包围了！》).
@@ -25,7 +12,13 @@ Learn magic in the world of Harry Potter; become the one everyone at school ador
 ---
-## Why we built it
+## 🌐 Live Demo
 Free to play, no setup required: [infiplot.com](https://infiplot.com)
 ---
 ## Team & Vision
 We're a group of young people from Tsinghua University and other schools.
@@ -33,7 +26,9 @@ On one hand, we're longtime, devoted players of galgames, otome games, FMV, and
 On the other hand, we happen to know a little about large-model technology: enough to turn ideas into working software quickly with AI, and to have formed some modest views on the technical paths available and the limits of what today's tech can build.
-The spark came on April 22, 2026, when [@zan2434](https://x.com/zan2434) and others released [flipbook](https://flipbook.page/). We were stunned and delighted by this entirely new form of interaction. So one day in May, we agreed on the spot to build something like this — both to help people live out the fantasies they'd once set aside, and to explore the new modes of interaction that multimodal models make possible.
+The spark came on April 22, 2026, when [@zan2434](https://x.com/zan2434) and others released [flipbook](https://flipbook.page/). We were stunned and delighted by this entirely new form of interaction.
 So one day in May, we agreed on the spot to build something like this — both to help people live out the fantasies they'd once set aside, and to explore the new modes of interaction that multimodal models make possible.
 The project is still very early and many features are far from polished. We'd love your feedback — open an [issue](https://github.com/zonghaoyuan/infiplot/issues), or join our dev team and explore the new possibilities with us, and satisfy your own curiosity.
@@ -47,7 +42,7 @@ Built on text, image, and audio models, we've assembled a multi-agent framework
 We call each complete playthrough a **story**.
-A story unfolds as a sequence of **scenes**. Each scene is one AI-painted background plus a short tree of **beats** — moments of narration, dialogue, and the occasional choice. You tap through a scene's beats and the image stays put; only when a choice leads somewhere genuinely new — another place, a new point of view, a jump in time — does the AI paint the next scene.
+A story unfolds as a sequence of scenes. Each scene is one AI-painted background plus a short tree of beats — moments of narration, dialogue, and the occasional choice. You tap through a scene's beats and the image stays put; only when a choice leads somewhere genuinely new — another place, a new point of view, a jump in time — does the AI paint the next scene.
 ```mermaid
 flowchart TD
@@ -64,9 +59,9 @@ flowchart TD
    SC -. speculatively pre-generate the next scene .-> W
 ```
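The dotted edge in the flowchart above (speculatively pre-generating the next scene) could be sketched roughly as follows. This is a minimal illustration, not InfiPlot's actual implementation: `Scene`, `SceneGenerator`, and `ScenePrefetcher` are hypothetical names, and the generator callback stands in for whatever text and image calls the engine really makes.

```typescript
// Hypothetical types and names; none of these come from the InfiPlot codebase.
type SceneId = string;

interface Scene {
  id: SceneId;
  imageUrl: string;
  choices: SceneId[]; // scenes reachable from this one
}

// Stand-in for the real (unknown) scene generation pipeline.
type SceneGenerator = (id: SceneId) => Promise<Scene>;

class ScenePrefetcher {
  private cache = new Map<SceneId, Promise<Scene>>();
  private generate: SceneGenerator;

  constructor(generate: SceneGenerator) {
    this.generate = generate;
  }

  // Return the in-flight or finished generation if one exists; otherwise start it.
  get(id: SceneId): Promise<Scene> {
    let pending = this.cache.get(id);
    if (pending === undefined) {
      pending = this.generate(id);
      this.cache.set(id, pending);
    }
    return pending;
  }

  // Start generating every scene the current choices could lead to.
  // If there is exactly one possible next scene, also prefetch one step beyond it.
  async prefetchFrom(current: Scene): Promise<void> {
    const next = current.choices.map((id) => this.get(id));
    if (next.length === 1) {
      const only = await next[0];
      only.choices.forEach((id) => this.get(id));
    }
  }
}
```

Caching the promise rather than the finished scene means a player who picks a direction mid-generation simply waits on the work already in flight instead of triggering a second request. A real engine would also need error handling and cache eviction, which this sketch leaves out.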
-While you're reading one scene, the engine **speculatively generates the scenes your choices could lead to** — and, for unavoidable next steps, the scene after that. By the time you pick a direction, its image is usually already painted, so the cut feels instant. If you still notice some lag today, don't worry — we're working hard to bring it down.
+While you're reading one scene, the engine speculatively generates the scenes your choices could lead to — and, for unavoidable next steps, the scene after that. By the time you pick a direction, its image is usually already painted, so the cut feels instant. If you still notice some lag today, don't worry — we're working hard to bring it down.
-Clicking the background itself (not a button) routes through a **vision** model: it reads where you tapped and decides whether you're exploring the current scene (it inserts a beat — no new image) or moving on (a new scene). This builds on a valuable lesson we learned from flipbook, and we believe it will become one of InfiPlot's defining features — taking the experience to the next level.
+Clicking the background itself (not a button) routes through a vision model: it reads where you tapped and decides whether you're exploring the current scene (it inserts a beat — no new image) or moving on (a new scene). This builds on a valuable lesson we learned from flipbook, and we believe it will become one of InfiPlot's defining features — taking the experience to the next level.
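One way to picture the explore-versus-advance decision described above is as a small state update. The README does not specify the vision model's output format, so `ClickDecision`, `SceneState`, and `applyClickDecision` below are assumptions made purely for illustration.

```typescript
// Hypothetical shapes; the real vision-model contract is not described in the README.
interface Beat {
  kind: "narration" | "dialogue" | "choice";
  text: string;
}

interface SceneState {
  sceneId: string;
  imageUrl: string;
  beats: Beat[];
}

// Assumed result of asking the vision model to interpret a background click.
type ClickDecision =
  | { action: "explore"; beat: Beat } // stay in the scene and add a beat
  | { action: "advance"; nextSceneId: string; nextImageUrl: string }; // move on

function applyClickDecision(state: SceneState, decision: ClickDecision): SceneState {
  switch (decision.action) {
    case "explore":
      // Same scene, same image: only the beat list grows.
      return { ...state, beats: [...state.beats, decision.beat] };
    case "advance":
      // New scene: new image and a fresh beat list.
      return {
        sceneId: decision.nextSceneId,
        imageUrl: decision.nextImageUrl,
        beats: [],
      };
  }
}
```

Keeping the decision as a discriminated union makes it explicit that only the "advance" branch can change the image, which matches the behavior the paragraph describes.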
 There is no traditional game UI baked into the art. The AI paints the world in whatever style you pick — "stick figure on grid paper" or "cyberpunk noir" — and the dialogue panel and choice buttons are a light HTML layer drawn on top, tuned to sit over the scene. In other words, the UI fits the story of each playthrough, rather than staying the same every time.
@@ -82,7 +77,7 @@ After deploy, set your environment variables in the Vercel project — see the [
 ## Configuration guide
-InfiPlot talks to four kinds of model providers. **Text and Vision use any OpenAI-compatible endpoint** (OpenAI, Anthropic via an OpenAI-compat proxy, Gemini, OpenRouter, DeepSeek, local Ollama, …), so you can mix and match freely. **Image** currently goes to **Runware** (its own task-array protocol, not OpenAI-compatible), chosen for its combination of latency and cost. **TTS** uses **Xiaomi MiMo**'s own voice design / clone protocol (also not OpenAI-compatible) — per-character voice design, clone, and per-line delivery direction.
+InfiPlot talks to four kinds of model providers. **Text and Vision use any OpenAI-compatible endpoint**, so you can mix and match freely. **Image** currently goes to **Runware** (its own task-array protocol, not OpenAI-compatible). **TTS** uses **Xiaomi MiMo**'s own voice design / clone protocol, which supports per-character voice design, cloning, and per-line delivery direction.
 **1. Choose your providers**
@@ -91,7 +86,7 @@ InfiPlot talks to four kinds of model providers. **Text and Vision use any OpenA
 | Text · story director  | `TEXT_BASE_URL` `TEXT_API_KEY` `TEXT_MODEL`        | ✅ | `deepseek-v4-flash` via DeepSeek |
 | Image · scene renderer  | `IMAGE_BASE_URL` `IMAGE_API_KEY` `IMAGE_MODEL`     | ✅ | `runware:400@6` (FLUX.2 [klein] 9B KV) via [Runware](https://runware.ai) |
 | Vision · click reader  | `VISION_BASE_URL` `VISION_API_KEY` `VISION_MODEL`  | ✅ | `gemini-3.5-flash` via Google |
-| TTS · per-character voice | `TTS_BASE_URL` `TTS_API_KEY` `TTS_SPEECH_MODEL` | optional — leave blank to run silently | `mimo-v2.5-tts` via Xiaomi MiMo (own protocol, not OpenAI-compat) |
+| TTS · per-character voice | `TTS_BASE_URL` `TTS_API_KEY` `TTS_SPEECH_MODEL` | optional — leave blank to run silently | `mimo-v2.5-tts` via Xiaomi MiMo |
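As a rough sketch of how the variables in the table above might be validated at startup: the variable names are taken from the table, but `loadConfig`, `ProviderConfig`, and the rule that an incomplete TTS block falls back to silent mode are assumptions, not the project's actual loader.

```typescript
// Hypothetical loader; only the environment variable names come from the README table.
type Env = Record<string, string | undefined>;

interface ProviderConfig {
  baseUrl: string;
  apiKey: string;
  model: string;
}

interface AppConfig {
  text: ProviderConfig;
  image: ProviderConfig;
  vision: ProviderConfig;
  tts: ProviderConfig | null; // null means run silently
}

// Read one provider block; returns null if any of its three values is blank.
function readProvider(env: Env, prefix: string, modelVar: string): ProviderConfig | null {
  const baseUrl = env[`${prefix}_BASE_URL`];
  const apiKey = env[`${prefix}_API_KEY`];
  const model = env[modelVar];
  if (!baseUrl || !apiKey || !model) return null;
  return { baseUrl, apiKey, model };
}

function requireProvider(env: Env, prefix: string, modelVar: string): ProviderConfig {
  const provider = readProvider(env, prefix, modelVar);
  if (provider === null) {
    throw new Error(`Missing ${prefix}_BASE_URL, ${prefix}_API_KEY, or ${modelVar}`);
  }
  return provider;
}

function loadConfig(env: Env): AppConfig {
  return {
    text: requireProvider(env, "TEXT", "TEXT_MODEL"),
    image: requireProvider(env, "IMAGE", "IMAGE_MODEL"),
    vision: requireProvider(env, "VISION", "VISION_MODEL"),
    // Assumption: TTS is optional, so an incomplete block disables voice.
    tts: readProvider(env, "TTS", "TTS_SPEECH_MODEL"),
  };
}
```

Passing the environment in as a plain record, rather than reading a global, keeps the function easy to test and makes the required-versus-optional split visible in one place.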
 **2. Set the environment variables**
@@ -105,44 +100,24 @@ See `apps/web/.env.example` for the exact shape.
 **3. Mind the cost**
-With the recommended trio, each **scene** is dominated by the text-LLM call. The FLUX.2 [klein] 9B KV image is roughly **\$0.00078** per scene (1792×1024, 4 steps, sub-second); the text call is the rest (very cheap with deepseek-v4-flash). Tapping through a scene's beats is free. To keep transitions instant, the engine also **pre-generates scenes you might pick but don't** — so real spend runs somewhat higher than the scenes you actually see. There is no rate limiting or auth out of the box — if you make your deployment public, your bill will reflect that. Add limits (and consider lowering the prefetch depth) before sharing widely.
+With the recommended trio, each scene's cost comes mainly from the image generation model. The FLUX.2 [klein] 9B KV image is roughly **\$0.00078** per scene (1792×1024, 4 steps, sub-second); the recommended text model is `deepseek-v4-flash`, so text costs are negligible by comparison. Tapping through a scene's beats is free. To keep transitions instant, the engine also pre-generates scenes you might pick but ultimately don't, so real spend runs somewhat higher than the number of scenes you actually see would suggest.
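To make the effect of pre-generation on spend concrete, here is a back-of-the-envelope estimate. The per-scene figure is the one quoted above; the function name and the scene counts in the usage note are made up for illustration, and text-model cost is ignored.

```typescript
// Per-scene image cost quoted in the README (USD).
const IMAGE_COST_PER_SCENE_USD = 0.00078;

// Hypothetical helper: image spend covers every generated scene,
// including prefetched scenes the player never visits.
function estimateImageSpendUsd(scenesSeen: number, prefetchedButUnused: number): number {
  if (scenesSeen < 0 || prefetchedButUnused < 0) {
    throw new RangeError("scene counts must be non-negative");
  }
  return (scenesSeen + prefetchedButUnused) * IMAGE_COST_PER_SCENE_USD;
}
```

For example, a playthrough with 100 scenes seen and 150 prefetched scenes that were never visited would come to (100 + 150) × \$0.00078 ≈ \$0.195 in image generation, compared with \$0.078 for the visible scenes alone.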
 ---
 ## Roadmap
-- [ ] Latency you can't perceive
+- [ ] Make generation latency imperceptible
 - [ ] Compatibility with more model providers
 - [ ] Free-form player input mid-story
 - [ ] Mobile browser support
 - [ ] Compatibility with most providers
 - [ ] User accounts and login
 - [ ] Upgrade from static images to motion video
 - [ ] Free-form player input mid-story
 - [ ] Voice interaction
-- [ ] Story sharing
+- [ ] Share the story you're playing
 - [ ] Mobile app
 ---
 ## Contributing
 Issues and pull requests are welcome. To run InfiPlot locally (Node 20+, pnpm 9+):
 ```bash
 pnpm install
 cp apps/web/.env.example apps/web/.env.local   # fill in your keys — see the Configuration guide
 pnpm dev                                        # open http://localhost:3000
 ```
 By contributing, you agree that your contributions are licensed under the AGPL-3.0.
 ---
 ## Star history
 [![Star History Chart](https://api.star-history.com/svg?repos=zonghaoyuan/infiplot&type=Date)](https://star-history.com/#zonghaoyuan/infiplot&Date)
 ---
 ## License
 [AGPL-3.0](LICENSE) © InfiPlot. The core is fully open source. AGPL's "network use is distribution" clause means anyone who runs a modified version as a network service must also publish their source — this keeps the core open while leaving room for a future hosted or commercial edition.
@@ -1,21 +1,8 @@
-[English](README.md) · 简体中文 · [日本語](README.ja.md)
+[English](https://github.com/zonghaoyuan/infiplot "Back to homepage") · 简体中文 · [日本語](README.ja.md)
-# InfiPlot
+# ⚡ 概览
-> 第一款 AI 实时生成的交互式剧情游戏 —— 输入你幻想中的场景，然后让它沉浸式、可视化地呈现在你面前，并且亲身参与其中。每一幕剧情、每一张图片、每一个角色，都由多模态AI引擎实时设计和生成，只为满足你儿时那个穿越进动画里的梦想。
+InfiPlot是一款AI实时生成内容的互动剧情游戏，这里没有预设好的剧情、角色，所有内容都根据你的需求定制化地生成。
 [![License: AGPL v3](https://img.shields.io/badge/License-AGPL%20v3-blue.svg)](LICENSE)
 [![GitHub stars](https://img.shields.io/github/stars/zonghaoyuan/infiplot?style=social)](https://github.com/zonghaoyuan/infiplot/stargazers)
 **▶ 内测期间，免费在线试玩，无需本地部署 —— [infiplot.com](https://infiplot.com)**
 如果你觉得这个项目有意思，麻烦给仓库点一个免费的 star ⭐，这是对我们努力的最大肯定，也能实实在在地帮助更多人发现它。谢谢你！
 ---
 ## 项目介绍
 InfiPlot是世界上第一款实现了AI实时生成内容的互动剧情游戏，这里没有预设好的剧情、角色，甚至没有预设好的音色，所有内容都根据你的需求定制化地生成。我们力求实现比肩美少女游戏（galgame）、乙女游戏（Otome games）和全动态真人互动游戏（FMV）的体验，同时又能提供个性化的“参与式幻想”，让视听体验更沉浸，充分满足你的想象力和好奇心。
 用一句话说，我们要做的是一款用AI实时生成内容的《完蛋！我被美女包围了！》
@@ -25,7 +12,13 @@ InfiPlot是世界上第一款实现了AI实时生成内容的互动剧情游戏
 ---
-## 开发初衷
+## 🌐 在线体验
 免费在线试玩，无需本地部署：[infiplot.com](https://infiplot.com)
 ---
 ## 团队与愿景
 我们是一群来自清华大学等高校的年轻人。
@@ -33,7 +26,8 @@ InfiPlot是世界上第一款实现了AI实时生成内容的互动剧情游戏
 另一方面，我们恰好又对大模型技术有些了解，能用AI快速实现想法，对技术路线和基于已有技术的产品能力边界有一些浅薄的思考。
-契机发生在 2026 年 4 月 22 日，[@zan2434](https://x.com/zan2434) 等人发布了 [flipbook](https://flipbook.page/)，我们对这种全新的交互形态感到震惊和欣喜。于是在 5 月的某一天，我们一拍即合，决定做一款这样的产品，既帮助大家满足那些曾经遗憾过的幻想，又能够探索多模态模型所带来的新的交互形态。
+契机发生在 2026 年 4 月 22 日，[@zan2434](https://x.com/zan2434) 等人发布了 [flipbook](https://flipbook.page/)，我们对这种全新的交互形态感到震惊和欣喜。
 于是在 5 月的某一天，我们一拍即合，决定做一款这样的产品，既帮助大家满足那些曾经遗憾过的幻想，又能够探索多模态模型所带来的新的交互形态。
 目前我们的项目还很早期，有许多功能尚不完善，欢迎提交 [issues](https://github.com/zonghaoyuan/infiplot/issues) 反馈问题，或者加入我们的开发团队一起探索新的可能性，满足你的好奇心。
@@ -47,7 +41,7 @@ InfiPlot是世界上第一款实现了AI实时生成内容的互动剧情游戏
 我们把每一次游玩的整体体验称为故事（story）。
-故事以一连串**场景（scene）**的形式展开。每个场景由一张 AI 绘制的背景图，加上一棵简短的**节拍（beat）**树组成 —— 也就是旁白、对话和偶尔出现的选项。你逐拍点过一个场景时，画面始终不变；只有当某个选项把你带到真正全新的地方 —— 换了空间、换了视角、跳跃了时间 —— AI 才会绘制下一幕场景。
+故事以一连串场景（scene）的形式展开。每个场景由一张 AI 绘制的背景图，加上一棵简短的节拍（beat）树组成 —— 也就是旁白、对话和偶尔出现的选项。你逐拍点过一个场景时，画面始终不变；只有当某个选项把你带到真正全新的地方 —— 换了空间、换了视角、跳跃了时间 —— AI 才会绘制下一幕场景。
 ```mermaid
 flowchart TD
@@ -64,9 +58,9 @@ flowchart TD
    SC -. 预测式预生成下一幕 .-> W
 ```
-当你正在阅读一幕场景时，引擎会**预测式地生成你的选项可能通向的那些场景** —— 对于无法回避的下一步，还会再往前生成一幕。等你真正选定方向时，那一幕的图通常已经画好了，于是切换瞬间完成、毫无停顿。如果你现在仍然感到有些延迟，别担心，我们正在努力优化它。
+当你正在阅读一幕场景时，引擎会预测式地生成你的选项可能通向的那些场景 —— 对于无法回避的下一步，还会再往前生成一幕。等你真正选定方向时，那一幕的图通常已经画好了，于是切换瞬间完成、毫无停顿。如果你现在仍然感到有些延迟，别担心，我们正在努力优化它。
-直接点击背景本身（而非按钮）会走一个**视觉（vision）**模型：它读取你点击的位置，判断你是在探索当前场景（于是插入一个节拍 —— 不生成新图），还是要继续前进（生成一幕新场景）。这是基于我们从flipbook那里学到的宝贵认知，我们相信这个功能会在未来成为InfiPlot的关键功能，让你的游玩体验更上一层楼。
+直接点击背景本身（而非按钮）会走一个视觉（vision）模型：它读取你点击的位置，判断你是在探索当前场景（于是插入一个节拍 —— 不生成新图），还是要继续前进（生成一幕新场景）。这是基于我们从flipbook那里学到的宝贵认知，我们相信这个功能会在未来成为InfiPlot的关键功能，让你的游玩体验更上一层楼。
 未来，画面里将没有烤进任何传统的游戏 UI。AI 会用你选择的任意风格来描绘整个世界 —— 「方格纸上的火柴人」也好，「赛博朋克黑色电影」也罢 —— 而对话框和选项按钮，只是叠在画面之上、并为贴合场景而精心调校过的一层轻量 HTML。也就是说，每次游玩时，UI都会契合当前的故事，而不是一成不变。
@@ -82,7 +76,7 @@ flowchart TD
 ## 配置教程
-InfiPlot 会与四类模型供应商通信。**文本（Text）和视觉（Vision）都使用 OpenAI 兼容的接口**（OpenAI、通过 OpenAI 兼容代理的 Anthropic、Gemini、OpenRouter、DeepSeek、本地 Ollama……），可以自由搭配。**图像（Image）**目前接入 **Runware**（其自有的 task-array 协议，并非 OpenAI 兼容），因为延迟和成本的叠加考量。**语音（TTS）**使用**小米 MiMo** 自有的音色设计/克隆协议（同样不是 OpenAI 兼容）——支持角色级音色设计、克隆与逐行演绎指导。
+InfiPlot 会与四类模型供应商通信。**文本（Text）和视觉（Vision）都使用 OpenAI 兼容的接口**，可以自由搭配。**图像（Image）**目前接入 **Runware**（其自有的 task-array 协议，并非 OpenAI 兼容）。**语音（TTS）**使用**小米 MiMo** 自有的音色设计/克隆协议——支持角色级音色设计、克隆与逐行演绎指导。
 **1. 选择你的供应商**
@@ -91,7 +85,7 @@ InfiPlot 会与四类模型供应商通信。**文本（Text）和视觉（Visio
 | Text · 剧情导演  | `TEXT_BASE_URL` `TEXT_API_KEY` `TEXT_MODEL`        | ✅ | DeepSeek 的 `deepseek-v4-flash` |
 | Image · 场景渲染  | `IMAGE_BASE_URL` `IMAGE_API_KEY` `IMAGE_MODEL`     | ✅ | [Runware](https://runware.ai) 的 `runware:400@6`（FLUX.2 [klein] 9B KV） |
 | Vision · 点击解读  | `VISION_BASE_URL` `VISION_API_KEY` `VISION_MODEL`  | ✅ | Google 的 `gemini-3.5-flash` |
-| TTS · 角色配音 | `TTS_BASE_URL` `TTS_API_KEY` `TTS_SPEECH_MODEL` | 可选 —— 留空则静音运行 | 小米 MiMo 的 `mimo-v2.5-tts`（自有协议，非 OpenAI 兼容） |
+| TTS · 角色配音 | `TTS_BASE_URL` `TTS_API_KEY` `TTS_SPEECH_MODEL` | 可选 —— 留空则静音运行 | 小米 MiMo 的 `mimo-v2.5-tts` |
 **2. 填写环境变量**
@@ -105,44 +99,24 @@ InfiPlot 会与四类模型供应商通信。**文本（Text）和视觉（Visio
 **3. 注意成本**
-使用推荐的三件套时，每一幕**场景**的开销主要来自文本 LLM 调用。FLUX.2 [klein] 9B KV 的图像大约 **\$0.00078** 一张（1792×1024，4 步，亚秒级）；其余主要是文本调用（使用 `deepseek-v4-flash` 模型时成本极低）。逐拍点过一个场景是免费的。为了让切换瞬间完成，引擎还会**预先生成那些你可能选、但最终没选的场景** —— 所以真实花费会比你实际看到的场景数略高一些。开箱状态下没有任何限流或鉴权 —— 如果你把部署公开出去，账单会如实反映这一点。在大范围分享之前，请先加上限流（并酌情降低预取深度）。
+使用推荐的三件套时，每一幕场景的开销主要来自图像生成模型。FLUX.2 [klein] 9B KV 的图像大约 **$0.00078** 一张（1792×1024，4 步，亚秒级）；文本模型使用 `deepseek-v4-flash` 时，成本极低。逐拍点过一个场景是免费的。为了让切换瞬间完成，引擎还会预测式地生成那些你可能选、但最终可能没选的场景 —— 所以真实花费会比你实际看到的场景数略高一些。
 ---
-## 路线图
+## Roadmap
-- [ ] 用户感知不到的延迟
+- [ ] 让用户感知不到生成延迟
 - [ ] 兼容更多模型 provider
 - [ ] 游玩过程中支持用户自定义输入
 - [ ] 移动端浏览器适配
 - [ ] 兼容大多数 provider
 - [ ] 用户注册登录系统
 - [ ] 由静态图升级为动态视频
 - [ ] 游玩过程中支持用户自定义输入
 - [ ] 语音交互
-- [ ] 分享故事的功能
+- [ ] 分享正在游玩的故事
 - [ ] 移动端 app
 ---
 ## 参与贡献
 欢迎提交 Issue 和 Pull Request。在本地运行 InfiPlot（需要 Node 20+ 和 pnpm 9+）：
 ```bash
 pnpm install
 cp apps/web/.env.example apps/web/.env.local   # 填入你的密钥 —— 参见「配置教程」
 pnpm dev                                        # 打开 http://localhost:3000
 ```
 提交贡献即表示你同意你的贡献以 AGPL-3.0 协议授权。
 ---
 ## Star 趋势
 [![Star History Chart](https://api.star-history.com/svg?repos=zonghaoyuan/infiplot&type=Date)](https://star-history.com/#zonghaoyuan/infiplot&Date)
 ---
 ## 许可证
 [AGPL-3.0](LICENSE) © InfiPlot。内核完全开源。AGPL 的「网络使用即分发」条款意味着：任何人若将修改后的版本作为网络服务运行，也必须公开其源代码 —— 这既让内核保持开放，又为未来的托管版或商业版本留下了空间。