# InfiPlot

> A real-time AI-generated interactive story game, painted scene by scene. You talk and explore within a scene; when the story turns a corner, it paints the next. You click. It paints. The story unfolds.

---
## How it works

The story unfolds as a sequence of **scenes**. Each scene is one AI-painted background plus a short tree of **beats**: moments of narration, dialogue, and the occasional choice. You tap through a scene's beats and the image stays put; only when a choice leads somewhere genuinely new (another place, a new point of view, a jump in time) does the AI paint the next scene.

```
  entering a scene
        │
        ▼
  1. Text LLM directs the whole scene at once: a background prompt
     plus a tree of beats (narration / dialogue / choices)
        │
        ▼
  2. Image model paints the background once, 16:9, no UI baked in
        │
        ▼
  [ tap through beats: no model calls, instant ]
        │
        ├─ in-scene choice ──────▶ jump to another beat (instant)
        │
        └─ scene-change choice ──▶ the next scene
                                   (usually pre-generated, see below)
```

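
The scene-and-beat structure described above can be sketched as a small data model. This is only an illustration: every type and function name here (`Scene`, `Beat`, `Choice`, `tap`, `choose`) is hypothetical, and the real definitions in `packages/types` may look quite different. The point is that tapping and in-scene choices are plain lookups in a tree that was generated up front, so they need no model call.

```typescript
// Hypothetical shapes for illustration only; not the project's actual types.
type Choice =
  | { label: string; kind: "in-scene"; targetBeatId: string }
  | { label: string; kind: "scene-change"; nextSceneHint: string };

interface Beat {
  id: string;
  kind: "narration" | "dialogue" | "choice";
  speaker?: string;
  text: string;
  next?: string; // id of the beat shown on the next tap
  choices?: Choice[]; // only present on "choice" beats
}

interface Scene {
  id: string;
  backgroundPrompt: string; // sent to the image model once per scene
  entryBeatId: string;
  beats: Record<string, Beat>;
}

type StepResult =
  | { type: "beat"; beat: Beat }
  | { type: "scene-change"; hint: string }
  | { type: "end" };

// A plain tap: follow the `next` pointer inside the already-generated tree.
function tap(scene: Scene, currentBeatId: string): StepResult {
  const current = scene.beats[currentBeatId];
  if (!current || !current.next) return { type: "end" };
  const nextBeat = scene.beats[current.next];
  return nextBeat ? { type: "beat", beat: nextBeat } : { type: "end" };
}

// A choice either jumps within the scene or asks for a new scene.
function choose(scene: Scene, choice: Choice): StepResult {
  if (choice.kind === "scene-change") {
    return { type: "scene-change", hint: choice.nextSceneHint };
  }
  const target = scene.beats[choice.targetBeatId];
  return target ? { type: "beat", beat: target } : { type: "end" };
}
```
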
While you're reading one scene, the engine **speculatively generates the scenes your choices could lead to** and, for unavoidable next steps, the scene after that. By the time you pick a direction, its image is usually already painted, so the cut feels instant.

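
One way to picture speculative generation is a cache of in-flight promises keyed by choice. The sketch below is an assumption about how such a prefetcher could be written, not the engine's actual code: `ScenePrefetcher`, `prefetch`, and `take` are hypothetical names, and it only covers one level of look-ahead (it does not model the "scene after that" case).

```typescript
// Hypothetical prefetcher; names and behavior are illustrative assumptions.
class ScenePrefetcher<S> {
  private inFlight = new Map<string, Promise<S>>();
  private generate: (choiceKey: string) => Promise<S>;

  constructor(generate: (choiceKey: string) => Promise<S>) {
    this.generate = generate;
  }

  // Start generating every scene the current choices could lead to.
  prefetch(choiceKeys: string[]): void {
    for (const key of choiceKeys) {
      if (this.inFlight.has(key)) continue; // already being generated
      const pending = this.generate(key);
      // Evict a failed generation so a later pick retries instead of reusing the error.
      pending.catch(() => {
        if (this.inFlight.get(key) === pending) this.inFlight.delete(key);
      });
      this.inFlight.set(key, pending);
    }
  }

  // When the player picks, reuse the in-flight (or finished) promise if there is one.
  take(key: string): Promise<S> {
    const pending = this.inFlight.get(key) ?? this.generate(key);
    this.inFlight.clear(); // forget the branches that were not taken
    return pending;
  }
}
```

Storing promises rather than finished scenes means a pick made while generation is still running waits on the existing request instead of starting a second one.
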
Clicking the background itself (not a button) routes through a **vision** model: it reads where you tapped and decides whether you're exploring the current scene (it inserts a beat, with no new image) or moving on (a new scene).

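
The two outcomes of a background click map naturally onto a discriminated union. The following is a minimal sketch under stated assumptions: the vision model's real request and response formats are not documented here, so `NormalizedClick`, `ClickDecision`, `normalizeClick`, and `applyDecision` are all hypothetical. It also assumes the click is sent as coordinates normalized to the image size, which is one common convention, not a confirmed detail.

```typescript
// Hypothetical types; the actual vision-model contract may differ.
interface NormalizedClick {
  x: number; // 0..1 from the left edge
  y: number; // 0..1 from the top edge
}

// Convert a pixel position into resolution-independent coordinates.
function normalizeClick(px: number, py: number, width: number, height: number): NormalizedClick {
  const clamp = (v: number) => Math.min(1, Math.max(0, v));
  return { x: clamp(px / width), y: clamp(py / height) };
}

// What the vision step is assumed to decide.
type ClickDecision =
  | { kind: "explore"; insertedBeatText: string } // stay in the scene, add a beat
  | { kind: "move-on"; nextSceneHint: string }; // leave for a new scene

type ClickAction =
  | { type: "insert-beat"; text: string }
  | { type: "load-scene"; hint: string };

// Route the decision: only "move-on" leads to a new image.
function applyDecision(decision: ClickDecision): ClickAction {
  switch (decision.kind) {
    case "explore":
      return { type: "insert-beat", text: decision.insertedBeatText };
    case "move-on":
      return { type: "load-scene", hint: decision.nextSceneHint };
  }
}
```
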
There is no traditional game UI baked into the art. The AI paints the world in whatever style you pick, whether "stick figure on grid paper" or "cyberpunk noir", and the dialogue panel and choice buttons are a light HTML layer drawn on top, tuned to sit over the scene.

---
## One-click deploy

[Deploy with Vercel](https://vercel.com/new/clone?repository-url=https://github.com/zonghaoyuan/infiplot&root-directory=apps/web&env=TEXT_BASE_URL,TEXT_API_KEY,TEXT_MODEL,IMAGE_BASE_URL,IMAGE_API_KEY,IMAGE_MODEL,VISION_BASE_URL,VISION_API_KEY,VISION_MODEL,TTS_BASE_URL,TTS_API_KEY,TTS_SPEECH_MODEL,MOCK_IMAGE&envDescription=Three%20required%20providers%20%2B%20optional%20TTS.%20Any%20OpenAI-compatible%20endpoint%20works%20for%20text%2Fvision%2Ftts.&envLink=https://github.com/zonghaoyuan/infiplot%23environment-variables)

After deploy, set the environment variables (see below) in your Vercel project. Nine are required; TTS is optional (leave blank to run silently); `MOCK_IMAGE=true` skips image generation for cheap TTS-only testing. The Vercel project's **Root Directory** must be set to `apps/web` (the deploy button passes this; if you configure manually, set it in Project Settings).

---
## Environment variables

Three required providers + optional TTS. Text, Vision, and TTS accept any OpenAI-compatible endpoint (OpenAI, Anthropic via OpenAI-compat proxy, Gemini, OpenRouter, DeepSeek, local Ollama, …). Image goes to **Runware** (its own task-array protocol, not OpenAI-compatible).

| Provider | Variables | Required? | Recommended |
|---|---|---|---|
| Text · story director | `TEXT_BASE_URL` `TEXT_API_KEY` `TEXT_MODEL` | ✅ | `claude-opus-4-7` via Anthropic |
| Image · scene renderer | `IMAGE_BASE_URL` `IMAGE_API_KEY` `IMAGE_MODEL` | ✅ | `runware:400@6` (FLUX.2 [klein] 9B KV) via [Runware](https://runware.ai) |
| Vision · click reader | `VISION_BASE_URL` `VISION_API_KEY` `VISION_MODEL` | ✅ | `gemini-3-flash` via Google |
| TTS · per-character voice | `TTS_BASE_URL` `TTS_API_KEY` `TTS_SPEECH_MODEL` | optional (leave blank to run silently) | `mimo-v2.5-tts` via Xiaomi MiMo |

There's also a flag for cheap testing:

| Variable | Effect |
|---|---|
| `MOCK_IMAGE=true` | Skip image generation; the renderer returns a static placeholder. Story, voice, and choices still run normally. Great for iterating on TTS without burning Runware credits. |

See `apps/web/.env.example` for the exact shape.

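
If you want to catch a missing variable before the first request fails, a small startup check along these lines may help. It is a sketch, not part of the project: `checkEnv` and `EnvReport` are hypothetical names, the variable names come from the tables above, and treating TTS as enabled only when all three TTS variables are set is an assumption about how "leave blank to run silently" behaves.

```typescript
// The nine required variables, as listed in the provider table.
const REQUIRED_VARS = [
  "TEXT_BASE_URL", "TEXT_API_KEY", "TEXT_MODEL",
  "IMAGE_BASE_URL", "IMAGE_API_KEY", "IMAGE_MODEL",
  "VISION_BASE_URL", "VISION_API_KEY", "VISION_MODEL",
] as const;

const TTS_VARS = ["TTS_BASE_URL", "TTS_API_KEY", "TTS_SPEECH_MODEL"] as const;

type Env = Record<string, string | undefined>;

interface EnvReport {
  missing: string[]; // required variables that are unset or blank
  ttsEnabled: boolean; // assumption: all three TTS variables must be set
  mockImage: boolean; // true only for the literal string "true"
}

// Hypothetical helper: inspects an env-like object and reports what is configured.
function checkEnv(env: Env): EnvReport {
  const isSet = (name: string) => (env[name] ?? "").trim() !== "";
  return {
    missing: REQUIRED_VARS.filter((name) => !isSet(name)),
    ttsEnabled: TTS_VARS.every((name) => isSet(name)),
    mockImage: env["MOCK_IMAGE"] === "true",
  };
}
```

In a Node process you would pass `process.env` and fail fast when `missing` is not empty.
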

---

## Local development

Requires Node 20+ and pnpm 9+.

```bash
pnpm install
cp apps/web/.env.example apps/web/.env.local
# fill in env vars (9 required + optional TTS/MOCK_IMAGE)
pnpm dev
# open http://localhost:3000
```

---
## Project layout

```
infiplot/
├── apps/web/          Next.js 16 app: pages + API routes (Vercel root)
└── packages/
    ├── types/         shared TypeScript types
    ├── ai-client/     unified OpenAI-compatible clients + Runware adapter
    ├── tts-client/    Xiaomi MiMo TTS adapter
    └── engine/        multi-agent AI orchestration (open core)
```

`packages/engine` is the open core: pure TS, with no Next.js or browser dependency. Import it directly to build your own interactive-narrative front-end (Tauri, Electron, CLI, anywhere).

---
## Cost & limits

With the recommended trio, the cost of each **scene** is dominated by the text-LLM call. The FLUX.2 [klein] 9B KV image is roughly **\$0.001** per scene (1792×1024, 4 steps, sub-second); the text call is the rest. Tapping through a scene's beats is free. To keep transitions instant, the engine also **pre-generates scenes you might pick but don't**, so real spend runs somewhat higher than the scenes you actually see would suggest. There is no rate limiting or auth out of the box; if you make your deployment public, your bill will reflect that. Add limits (and consider lowering the prefetch depth) before sharing widely.

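
To get a feel for how prefetching inflates the bill, the arithmetic can be written down as a small function. This is a back-of-the-envelope sketch: `estimateSpendUsd` and its inputs are hypothetical, the only figure taken from this README is the roughly \$0.001 image cost, and it assumes every prefetched scene pays for both a text call and an image. The text cost per scene depends on your model and prompt size, so you have to supply it yourself.

```typescript
// Image cost per scene quoted in this README (approximate).
const IMAGE_COST_PER_SCENE_USD = 0.001;

interface SpendInput {
  scenesSeen: number; // scenes the player actually visits
  textCostPerSceneUsd: number; // assumption: measure this for your own text model
  prefetchedPerSeenScene: number; // assumption: average unused speculative scenes per visited scene
}

// Hypothetical estimate: generated scenes = seen scenes plus speculative ones.
function estimateSpendUsd(input: SpendInput): number {
  const costPerGeneratedScene = input.textCostPerSceneUsd + IMAGE_COST_PER_SCENE_USD;
  const generatedScenes = input.scenesSeen * (1 + input.prefetchedPerSeenScene);
  return generatedScenes * costPerGeneratedScene;
}
```

With `prefetchedPerSeenScene` set to 1, for example, the estimate is twice what the visited scenes alone would cost.
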