docs(readme): restructure for star conversion optimization

- Restructure overview with scannable bullet lists for capabilities
- Move screenshots up (after overview), reduce from 14 to 6
- Extract configuration guide to docs/configuration{,.en,.ja}.md
- Update Vercel deploy button envLink to point to new config docs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
yuanzonghao
2026-06-29 12:16:03 +08:00
parent c66ee38ddd
commit 7dac77e200
6 changed files with 340 additions and 334 deletions
+62 -108
@@ -27,9 +27,22 @@ InfiPlot is an interactive story game with content generated by AI in real time.
In one line: what we're building is an AI-generated, real-time take on *Love Is All Around* (《完蛋!我被美女包围了!》).
Whether you're a six-year-old, a twenty-something, thirty-five, or sixty, there's a fantasy here that belongs to you and you alone:
Whatever your age, there's a fantasy here that belongs to you alone:
Learn magic in the world of Harry Potter; become the one everyone at school adores and confesses to; publish paper after paper in top journals and conferences with grant money to spare; step into *Empresses in the Palace* and live out the court intrigue; or return to your younger self and make a different choice about something you regret…
- Learn magic in the world of Harry Potter
- Become the one everyone at school adores and confesses to
- Publish top-tier papers and never run out of grant money
- Step into *Empresses in the Palace* and live out court intrigue
- Return to your younger self and choose differently about something you regret
- …
Core capabilities:
- **Multi-agent collaboration** — Writer, Character Designer, Cinematographer, and Painter work together to keep the story coherent and characters consistent
- **Speculative generation** — by the time you choose, the next scene is usually already painted; transitions feel instant
- **Click to explore** — tap anywhere on the scene; a vision model interprets your intent and responds
- **AI voice acting** — every character gets a unique voice, via Xiaomi MiMo (free) or StepFun (paid, higher quality)
- **Any art style** — stick figures, cyberpunk, watercolor, manga… generate in whatever style you want
---
@@ -39,75 +52,19 @@ Free to play, no setup required: [infiplot.com](https://infiplot.com)
---
## Deploy
InfiPlot offers multiple deployment options. For personal use, we recommend the one-click Vercel deploy; to self-host on your own server or local machine, use Docker.
### OpenDeploy / Vercel / Cloudflare (one-click)
Cloudflare deployment requires the Workers Paid Plan because the scene pipeline needs longer CPU time. OpenDeploy lets your AI agent handle the deployment for you.
<a href="https://opendeploy.dev/github/zonghaoyuan/infiplot"><img src="https://oss.opendeploy.dev/static/deploy-with-your-agent.svg" alt="Deploy with your agent" height="34"></a>&nbsp;
<a href="https://vercel.com/new/clone?repository-url=https://github.com/zonghaoyuan/infiplot&env=TEXT_BASE_URL,TEXT_API_KEY,TEXT_MODEL,IMAGE_BASE_URL,IMAGE_API_KEY,IMAGE_MODEL,VISION_BASE_URL,VISION_API_KEY,VISION_MODEL,TTS_BASE_URL,TTS_API_KEY,TTS_SPEECH_MODEL,MOCK_IMAGE&envDescription=Three%20required%20providers%20%2B%20optional%20TTS.%20Any%20OpenAI-compatible%20endpoint%20works%20for%20text%2Fvision.%20TTS%3A%20Xiaomi%20MiMo%20%28free%29%20or%20StepFun%20%28paid%2C%20better%20quality%29.&envLink=https://github.com/zonghaoyuan/infiplot/blob/main/README.en.md%23configuration-guide"><img src="https://vercel.com/button" alt="Deploy with Vercel" height="34"></a>&nbsp;
<a href="https://deploy.workers.cloudflare.com/?url=https://github.com/zonghaoyuan/infiplot"><img src="https://deploy.workers.cloudflare.com/button" alt="Deploy to Cloudflare" height="34"></a>
After deploy, fill in the environment variables — see the [Configuration guide](#configuration-guide) below. The repo root is the app itself: Vercel needs no special root directory; on Cloudflare, just set the build command to `pnpm build:cf`.
### Docker (self-hosted)
For VPS, home servers, or local machines. Supports x86 and ARM (including Apple Silicon Macs). No need to clone the repo — just download two files:
```bash
mkdir -p infiplot && cd infiplot
curl -fsSL https://raw.githubusercontent.com/zonghaoyuan/infiplot/main/docker-compose.yml -o docker-compose.yml
curl -fsSL https://raw.githubusercontent.com/zonghaoyuan/infiplot/main/.env.example -o .env.example
[ -f .env.local ] || cp .env.example .env.local
```
Edit `.env.local` with your API keys (see [Configuration guide](#configuration-guide)), then start:
```bash
docker compose up -d
```
Visit `http://localhost:3000` to start playing.
> You can also run the image directly without Compose:
> ```bash
> docker run -d -p 3000:3000 --env-file .env.local ghcr.io/zonghaoyuan/infiplot:latest
> ```
---
## 📸 Screenshots
<table>
<tr>
<td><a href="docs/screenshots/1.webp"><img src="docs/screenshots/1.webp" width="420" alt="InfiPlot screenshot 1"></a></td>
<td><a href="docs/screenshots/2.webp"><img src="docs/screenshots/2.webp" width="420" alt="InfiPlot screenshot 2"></a></td>
</tr>
<tr>
<td><a href="docs/screenshots/3.webp"><img src="docs/screenshots/3.webp" width="420" alt="InfiPlot screenshot 3"></a></td>
<td><a href="docs/screenshots/4.webp"><img src="docs/screenshots/4.webp" width="420" alt="InfiPlot screenshot 4"></a></td>
</tr>
<tr>
<td><a href="docs/screenshots/5.webp"><img src="docs/screenshots/5.webp" width="420" alt="InfiPlot screenshot 5"></a></td>
<td><a href="docs/screenshots/6.webp"><img src="docs/screenshots/6.webp" width="420" alt="InfiPlot screenshot 6"></a></td>
</tr>
<tr>
<td><a href="docs/screenshots/7.webp"><img src="docs/screenshots/7.webp" width="420" alt="InfiPlot screenshot 7"></a></td>
<td><a href="docs/screenshots/8.webp"><img src="docs/screenshots/8.webp" width="420" alt="InfiPlot screenshot 8"></a></td>
</tr>
<tr>
<td><a href="docs/screenshots/9.webp"><img src="docs/screenshots/9.webp" width="420" alt="InfiPlot screenshot 9"></a></td>
<td><a href="docs/screenshots/10.webp"><img src="docs/screenshots/10.webp" width="420" alt="InfiPlot screenshot 10"></a></td>
</tr>
<tr>
<td><a href="docs/screenshots/11.webp"><img src="docs/screenshots/11.webp" width="420" alt="InfiPlot screenshot 11"></a></td>
<td><a href="docs/screenshots/12.webp"><img src="docs/screenshots/12.webp" width="420" alt="InfiPlot screenshot 12"></a></td>
</tr>
<tr>
<td><a href="docs/screenshots/13.webp"><img src="docs/screenshots/13.webp" width="420" alt="InfiPlot screenshot 13"></a></td>
<td><a href="docs/screenshots/14.webp"><img src="docs/screenshots/14.webp" width="420" alt="InfiPlot screenshot 14"></a></td>
</tr>
</table>
@@ -134,68 +91,43 @@ There is no traditional game UI baked into the art. The AI paints the world in w
---
## Team & Vision
## Deploy
We're a group of young people from Tsinghua University and other schools.
InfiPlot offers multiple deployment options. For personal use, we recommend the one-click Vercel deploy; to self-host on your own server or local machine, use Docker.
On one hand, we're longtime, devoted players of galgames, otome games, FMV, and AI role-play games. Even while enjoying them, we kept imagining how much more delightful and thrilling it would be if the story choices weren't fixed in advance — or if you could truly interact with an AI character in depth, instead of just texting it through a chat app.
### OpenDeploy / Vercel / Cloudflare (one-click)
On the other hand, we happen to know a little about large-model technology: enough to turn ideas into working software quickly with AI, and to have formed some modest views on the technical paths available and the limits of what today's tech can build.
Cloudflare deployment requires the Workers Paid Plan because the scene pipeline needs longer CPU time. OpenDeploy lets your AI agent handle the deployment for you.
The spark came on April 22, 2026, when [@zan2434](https://x.com/zan2434) and others released [flipbook](https://flipbook.page/). We were stunned and delighted by this entirely new form of interaction.
<a href="https://opendeploy.dev/github/zonghaoyuan/infiplot"><img src="https://oss.opendeploy.dev/static/deploy-with-your-agent.svg" alt="Deploy with your agent" height="34"></a>&nbsp;
<a href="https://vercel.com/new/clone?repository-url=https://github.com/zonghaoyuan/infiplot&env=TEXT_BASE_URL,TEXT_API_KEY,TEXT_MODEL,IMAGE_BASE_URL,IMAGE_API_KEY,IMAGE_MODEL,VISION_BASE_URL,VISION_API_KEY,VISION_MODEL,TTS_BASE_URL,TTS_API_KEY,TTS_SPEECH_MODEL,MOCK_IMAGE&envDescription=Three%20required%20providers%20%2B%20optional%20TTS.%20Any%20OpenAI-compatible%20endpoint%20works%20for%20text%2Fvision.%20TTS%3A%20Xiaomi%20MiMo%20%28free%29%20or%20StepFun%20%28paid%2C%20better%20quality%29.&envLink=https://github.com/zonghaoyuan/infiplot/blob/main/docs/configuration.en.md"><img src="https://vercel.com/button" alt="Deploy with Vercel" height="34"></a>&nbsp;
<a href="https://deploy.workers.cloudflare.com/?url=https://github.com/zonghaoyuan/infiplot"><img src="https://deploy.workers.cloudflare.com/button" alt="Deploy to Cloudflare" height="34"></a>
So one day in May, we agreed on the spot to build something like this — both to help people live out the fantasies they'd once set aside, and to explore the new modes of interaction that multimodal models make possible.
After deploy, set up your environment variables following the [Configuration guide](docs/configuration.en.md). The repo root is the app itself: Vercel needs no special root directory; on Cloudflare, just set the build command to `pnpm build:cf`.
The project is still very early and many features are far from polished. We'd love your feedback: open an [issue](https://github.com/zonghaoyuan/infiplot/issues), or join our dev team to explore the new possibilities with us and satisfy your own curiosity.
### Docker (self-hosted)
Get in touch: hi@infiplot.com
For VPS, home servers, or local machines. Supports x86 and ARM (including Apple Silicon Macs). No need to clone the repo — just download two files:
Scan to join our **beta community on QQ** (group ID `575404333`) to share feedback and help shape the project:
```bash
mkdir -p infiplot && cd infiplot
curl -fsSL https://raw.githubusercontent.com/zonghaoyuan/infiplot/main/docker-compose.yml -o docker-compose.yml
curl -fsSL https://raw.githubusercontent.com/zonghaoyuan/infiplot/main/.env.example -o .env.example
[ -f .env.local ] || cp .env.example .env.local
```
<img src="public/qq-group.webp" alt="InfiPlot beta community QQ group QR code" width="200" />
Edit `.env.local` with your API keys (see [Configuration guide](docs/configuration.en.md)), then start:
---
```bash
docker compose up -d
```
## Configuration guide
Visit `http://localhost:3000` to start playing.
InfiPlot talks to four kinds of model providers. **Text and Vision use any OpenAI-compatible endpoint**, so you can mix and match freely — for Google Gemini, point `*_BASE_URL` at its OpenAI-compatible endpoint (`https://generativelanguage.googleapis.com/v1beta/openai`). For Anthropic Claude, a compatible gateway (e.g. LiteLLM) is recommended — Anthropic's official endpoint offers an OpenAI-compatible layer but no caching, which raises cost and latency. **Image** supports **Runware** (its own task-array protocol) and **OpenAI** (`gpt-image`). **TTS** supports **Xiaomi MiMo** (its own voice design / clone protocol — per-character voice design, clone, and per-line delivery direction; free) and **StepFun** (32 preset voices, auto-matched by AI; paid but better quality).
**1. Choose your providers**
| Provider | Variables | Required? | Recommended |
|---|---|---|---|
| Text · story director | `TEXT_BASE_URL` `TEXT_API_KEY` `TEXT_MODEL` | ✅ | `deepseek-v4-flash` via DeepSeek |
| Image · scene renderer | `IMAGE_BASE_URL` `IMAGE_API_KEY` `IMAGE_MODEL` | ✅ | `runware:400@6` (FLUX.2 [klein] 9B KV) via [Runware](https://runware.ai) |
| Vision · click reader | `VISION_BASE_URL` `VISION_API_KEY` `VISION_MODEL` | ✅ | `gemini-3.5-flash` via Google |
| TTS · per-character voice | `TTS_BASE_URL` `TTS_API_KEY` `TTS_SPEECH_MODEL` | optional — leave blank to run silently | `mimo-v2.5-tts` via Xiaomi MiMo (free); paid alternative: `step-tts-2` via [StepFun](https://www.stepfun.com) |
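Before deploying, it can help to confirm that a text provider really speaks the OpenAI-compatible protocol. The following is a minimal sketch, assuming the provider serves chat completions at `<base URL>/chat/completions` (the usual OpenAI convention). The function names `chat_completions_url` and `smoke_test_text_provider` are hypothetical and not part of the repo:

```bash
# Hypothetical smoke test, not part of the InfiPlot repo.
# Assumption: the provider follows the OpenAI convention of serving
# chat completions at <base URL>/chat/completions.
chat_completions_url() {
  # Drop one trailing slash so "https://host/v1/" and "https://host/v1" agree
  printf '%s/chat/completions\n' "${1%/}"
}

smoke_test_text_provider() {
  # Sends a one-message request; -f makes curl exit non-zero on HTTP errors
  curl -fsS "$(chat_completions_url "$TEXT_BASE_URL")" \
    -H "Authorization: Bearer $TEXT_API_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"model\": \"$TEXT_MODEL\", \"messages\": [{\"role\": \"user\", \"content\": \"ping\"}]}"
}
```

With `TEXT_BASE_URL`, `TEXT_API_KEY`, and `TEXT_MODEL` exported, running `smoke_test_text_provider` should print a JSON response if the endpoint is compatible; an HTTP error suggests a wrong base URL, key, or model name.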
**2. Set the environment variables**
Nine variables are required; TTS is optional (leave blank to run silently). There's also a flag for cheap testing:
| Variable | Effect |
|---|---|
| `MOCK_IMAGE=true` | Skip image generation; the renderer returns a static placeholder. Story, voice, and choices still run normally. Great for iterating on TTS without burning Runware credits. |
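A quick way to catch a missing variable before starting the app is to check the nine required names from the table in step 1. This is a minimal sketch; `missing_required_vars` is a hypothetical helper, not something the repo ships, and it only sees variables that are exported in the current shell:

```bash
# Hypothetical helper, not part of the InfiPlot repo: prints the name of
# every required variable that is unset or empty in the environment.
missing_required_vars() {
  for name in \
    TEXT_BASE_URL TEXT_API_KEY TEXT_MODEL \
    IMAGE_BASE_URL IMAGE_API_KEY IMAGE_MODEL \
    VISION_BASE_URL VISION_API_KEY VISION_MODEL
  do
    # printenv exits non-zero for a variable that is not exported
    value=$(printenv "$name" || true)
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}
```

If your `.env.local` uses plain `KEY=value` lines that a shell can source (an assumption about your file), `set -a; . ./.env.local; set +a` exports everything in it, after which `missing_required_vars` prints nothing when all nine are present.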
Where to set them (see `.env.example` for the exact shape):
- **Local dev** — `.env.local`
- **Vercel** — Project Settings → Environment Variables
- **Cloudflare Workers** — from the repo root, run `wrangler secret put <NAME>` for each variable, or set them in the dashboard (Workers → infiplot → Settings → Variables and Secrets). For a private staging instance, gate the Worker behind [Cloudflare Access](https://developers.cloudflare.com/cloudflare-one/applications/) — zero-code email-whitelist auth in front of the Worker.
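For Cloudflare Workers, the per-variable `wrangler secret put` step can be looped. This is a sketch, assuming the values are already exported in your shell and that `wrangler secret put` reads the secret from standard input when a value is piped to it (the non-interactive form shown in Wrangler's documentation). `push_required_secrets` is a hypothetical name:

```bash
# Hypothetical helper, not part of the InfiPlot repo.
# Assumption: wrangler is installed and you run this from the repo root.
push_required_secrets() {
  for name in \
    TEXT_BASE_URL TEXT_API_KEY TEXT_MODEL \
    IMAGE_BASE_URL IMAGE_API_KEY IMAGE_MODEL \
    VISION_BASE_URL VISION_API_KEY VISION_MODEL
  do
    value=$(printenv "$name" || true)
    if [ -z "$value" ]; then
      echo "skipping $name: not set" >&2
      continue
    fi
    # Pipe the value in so it never appears in the argument list
    printf '%s' "$value" | wrangler secret put "$name"
  done
}
```

Unset variables are skipped with a message on stderr rather than uploaded as empty secrets, so the function can be re-run after filling in whatever was missing.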
**3. Mind the cost**
With the recommended trio, each scene's cost comes mainly from image generation. A FLUX.2 [klein] 9B KV image costs roughly **\$0.00078** per scene (1792×1024, 4 steps, sub-second); the text model is `deepseek-v4-flash`, so text costs are negligible by comparison. Tapping through a scene's beats is free. To keep transitions instant, the engine also pre-generates scenes you might pick but ultimately don't, so real spend runs somewhat higher than the scenes you actually see.
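To make the arithmetic concrete, here is a small sketch of an image-spend estimate. The \$0.00078 price is the figure quoted above. The ratio of generated scenes to seen scenes is an illustrative assumption, since the amount of speculative generation is not quantified here, and `estimate_image_cost` is a hypothetical helper:

```bash
# Rough image-spend estimate in US dollars (text and TTS costs are ignored).
# $1: scenes the player actually sees
# $2: images generated per scene seen (assumption: counts speculative
#     scenes that were painted but never shown)
estimate_image_cost() {
  awk -v seen="$1" -v ratio="$2" \
    'BEGIN { printf "%.4f\n", seen * ratio * 0.00078 }'
}
```

For example, `estimate_image_cost 100 3` models 100 scenes seen with three images generated for each and prints `0.2340`, or about \$0.23.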
**4. Image proxy (optional)**
By default the browser fetches images directly from the provider — no setup needed; leave `NEXT_PUBLIC_IMAGE_PROXY_URL` blank and you're completely unaffected. You only want this if you hit progressive "top-to-bottom" image loading (Chrome's `ERR_QUIC_PROTOCOL_ERROR` on some networks paints partial PNGs row by row): deploy a tiny Cloudflare Worker that re-fetches images server-side and serves them atomically over HTTP/2. One-click deploy at **[infiplot-image-proxy](https://github.com/zonghaoyuan/infiplot-image-proxy)**, then paste the `workers.dev` URL it prints into `NEXT_PUBLIC_IMAGE_PROXY_URL`.
**5. Let players bring their own voice Key (optional, recommended)**
Xiaomi rate-limits the TTS model by RPM/TPM. When a public deployment has many people playing at once through a single shared `TTS_API_KEY`, those limits are easy to hit — the symptom is **story and visuals work fine, but there's no audio**. To fix this, players can optionally enter **their own** Xiaomi MiMo key on the homepage (free to obtain). Synthesis then runs **browser-direct to Xiaomi**, the **key stays in the player's browser and never touches your server**, and they get stable voice with lower latency. It's purely additive: leave it blank and playback falls back to your server key exactly as before.
See the [Bring-your-own voice Key guide](docs/xiaomi-tts-key.md) for how to obtain and enter one.
> You can also run the image directly without Compose:
> ```bash
> docker run -d -p 3000:3000 --env-file .env.local ghcr.io/zonghaoyuan/infiplot:latest
> ```
---
@@ -222,6 +154,28 @@ See the [Bring-your-own voice Key guide](docs/xiaomi-tts-key.md) for how to obta
---
## Team & Vision
We're a group of young people from Tsinghua University and other schools.
On one hand, we're longtime, devoted players of galgames, otome games, FMV, and AI role-play games. Even while enjoying them, we kept imagining how much more delightful and thrilling it would be if the story choices weren't fixed in advance — or if you could truly interact with an AI character in depth, instead of just texting it through a chat app.
On the other hand, we happen to know a little about large-model technology: enough to turn ideas into working software quickly with AI, and to have formed some modest views on the technical paths available and the limits of what today's tech can build.
The spark came on April 22, 2026, when [@zan2434](https://x.com/zan2434) and others released [flipbook](https://flipbook.page/). We were stunned and delighted by this entirely new form of interaction.
So one day in May, we agreed on the spot to build something like this — both to help people live out the fantasies they'd once set aside, and to explore the new modes of interaction that multimodal models make possible.
The project is still very early and many features are far from polished. We'd love your feedback: open an [issue](https://github.com/zonghaoyuan/infiplot/issues), or join our dev team to explore the new possibilities with us and satisfy your own curiosity.
Get in touch: hi@infiplot.com
Scan to join our **beta community on QQ** (group ID `575404333`) to share feedback and help shape the project:
<img src="public/qq-group.webp" alt="InfiPlot beta community QQ group QR code" width="200" />
---
## Star history
[![Star History Chart](https://api.star-history.com/svg?repos=zonghaoyuan/infiplot&type=Date)](https://star-history.com/#zonghaoyuan/infiplot&Date)