Files
infiplot-web/README.md
T
yuanzonghao 8eda27f241 chore: complete @yume → @infiplot rename (post-PR#9)
PR #9 已完成首页和 layout 的视觉品牌迁移,此 commit 补齐剩余的
技术性改名 —— workspace 包名、source import、localStorage 键、
CSS keyframe、内部 header logo、.env.example、README。

- @yume/* → @infiplot/* (6 package.json + 17 imports + lockfile)
- localStorage/sessionStorage: yume:* → infiplot:*
  (含 PR #9 新增的 yume:hintClosed)
- CSS keyframe yume-ripple → infiplot-ripple
- new/play 页面 header logo "云梦" → "InfiPlot"
- 代码注释中的「云梦」style 形容词删除(layout.tsx, page.tsx)
- 根 package.json name + description(描述跟齐 staging
  "AI 实时交互剧情游戏")
- README: tagline / Vercel deploy URL / 目录树 / engine 描述

保留:prompts.ts 的 LLM 体裁术语「视觉小说/galgame」、CustomForm
placeholder 的「视觉小说画风」(图像模型识别的风格名词)。

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-02 09:27:00 +08:00

100 lines
5.5 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# InfiPlot
> A real-time AI-generated interactive story game — painted scene by scene. You talk and explore within a scene; when the story turns a corner, it paints the next. You click. It paints. The story unfolds.
---
## How it works
The story unfolds as a sequence of **scenes**. Each scene is one AI-painted background plus a short tree of **beats** — moments of narration, dialogue, and the occasional choice. You tap through a scene's beats and the image stays put; only when a choice leads somewhere genuinely new — another place, a new point of view, a jump in time — does the AI paint the next scene.
```
entering a scene
1. Text LLM directs the whole scene at once — a background prompt
plus a tree of beats (narration / dialogue / choices)
2. Image model paints the background once, 16:9, no UI baked in
[ tap through beats — no model calls, instant ]
├─ in-scene choice ──────▶ jump to another beat (instant)
└─ scene-change choice ──▶ the next scene
(usually pre-generated — see below)
```
While you're reading one scene, the engine **speculatively generates the scenes your choices could lead to** — and, for unavoidable next steps, the scene after that. By the time you pick a direction, its image is usually already painted, so the cut feels instant.
Clicking the background itself (not a button) routes through a **vision** model: it reads where you tapped and decides whether you're exploring the current scene (it inserts a beat — no new image) or moving on (a new scene).
There is no traditional game UI baked into the art. The AI paints the world in whatever style you pick — "stick figure on grid paper" or "cyberpunk noir" — and the dialogue panel and choice buttons are a light HTML layer drawn on top, tuned to sit over the scene.
---
## One-click deploy
[![Deploy with Vercel](https://vercel.com/button)](https://vercel.com/new/clone?repository-url=https://github.com/zonghaoyuan/infiplot&root-directory=apps/web&env=TEXT_BASE_URL,TEXT_API_KEY,TEXT_MODEL,IMAGE_BASE_URL,IMAGE_API_KEY,IMAGE_MODEL,VISION_BASE_URL,VISION_API_KEY,VISION_MODEL,TTS_BASE_URL,TTS_API_KEY,TTS_SPEECH_MODEL,MOCK_IMAGE&envDescription=Three%20required%20providers%20%2B%20optional%20TTS.%20Any%20OpenAI-compatible%20endpoint%20works%20for%20text%2Fvision%2Ftts.&envLink=https://github.com/zonghaoyuan/infiplot%23environment-variables)
After deploy, set the environment variables (see below) in your Vercel project. Nine are required; TTS is optional (leave blank to run silently); `MOCK_IMAGE=true` skips image generation for cheap TTS-only testing. The Vercel project's **Root Directory** must be set to `apps/web` (the deploy button passes this; if you configure manually, set it in Project Settings).
---
## Environment variables
Three required providers + optional TTS. Text, Vision, and TTS accept any OpenAI-compatible endpoint (OpenAI, Anthropic via OpenAI-compat proxy, Gemini, OpenRouter, DeepSeek, local Ollama, …). Image goes to **Runware** (its own task-array protocol, not OpenAI-compatible).
| Provider | Variables | Required? | Recommended |
|---|---|---|---|
| Text · story director | `TEXT_BASE_URL` `TEXT_API_KEY` `TEXT_MODEL` | ✅ | `claude-opus-4-7` via Anthropic |
| Image · UI renderer | `IMAGE_BASE_URL` `IMAGE_API_KEY` `IMAGE_MODEL` | ✅ | `runware:400@6` (FLUX.2 [klein] 9B KV) via [Runware](https://runware.ai) |
| Vision · click reader | `VISION_BASE_URL` `VISION_API_KEY` `VISION_MODEL` | ✅ | `gemini-3-flash` via Google |
| TTS · per-character voice | `TTS_BASE_URL` `TTS_API_KEY` `TTS_SPEECH_MODEL` | optional — leave blank to run silently | `mimo-v2.5-tts` via Xiaomi MiMo |
There's also a flag for cheap testing:
| Variable | Effect |
|---|---|
| `MOCK_IMAGE=true` | Skip image generation; the renderer returns a static placeholder. Story, voice, and choices still run normally. Great for iterating on TTS without burning Runware credits. |
See `apps/web/.env.example` for the exact shape.
---
## Local development
Requires Node 20+ and pnpm 9+.
```bash
pnpm install
cp apps/web/.env.example apps/web/.env.local
# fill in env vars (9 required + optional TTS/MOCK_IMAGE)
pnpm dev
# open http://localhost:3000
```
---
## Project layout
```
infiplot/
├── apps/web/ Next.js 16 app — pages + API routes (Vercel root)
└── packages/
├── types/ shared TypeScript types
├── ai-client/ unified OpenAI-compatible clients + Runware adapter
├── tts-client/ Xiaomi MiMo TTS adapter
└── engine/ multi-agent AI orchestration (open core)
```
`packages/engine` is the open core — pure TS, no Next.js or browser dependency. Import it directly to build your own interactive-narrative front-end (Tauri, Electron, CLI, anywhere).
---
## Cost & limits
With the recommended trio, each **scene** is dominated by the text-LLM call. The FLUX.2 [klein] 9B KV image is roughly **\$0.001** per scene (1792×1024, 4 steps, sub-second); the text call is the rest. Tapping through a scene's beats is free. To keep transitions instant, the engine also **pre-generates scenes you might pick but don't** — so real spend runs somewhat higher than the scenes you actually see. There is no rate limiting or auth out of the box — if you make your deployment public, your bill will reflect that. Add limits (and consider lowering the prefetch depth) before sharing widely.