b805b1d9c2
Symptom: in Chrome on certain networks the scene <img> renders row-by-row
from top to bottom — "层层加载" — instead of appearing atomically.
Root cause (confirmed via DevTools):
- Chrome opportunistically opens HTTP/3 (QUIC) to im.runware.ai.
- QUIC streams to Runware sometimes error mid-transfer:
net::ERR_QUIC_PROTOCOL_ERROR
HTTP-level status stays 200 (response headers received), but bytes are
truncated. The browser paints whatever PNG bytes it has so far → visible
row-by-row decode.
- The earlier preloadImage()+decode() trick can't fix this — neither
HTTP-cache reuse nor sync decode helps when the bytes themselves were
never fully delivered.
Two-tier fix:
1. Client: fetch → Blob → URL.createObjectURL() (app/play/page.tsx)
- <img src> only ever points to a blob: URL whose bytes are 100%
resident in the JS heap. No network-backed src = no possibility of
progressive paint.
- Module-level blobUrlCache keys by original URL so speculative
prefetch + the eventual commit share one fetch.
- Old blobs are URL.revokeObjectURL()'d on scene swap + unmount to
release memory.
2. Network: optional Cloudflare Worker proxy (worker/)
- Browser ↔ Worker is HTTP/2 over CF edge (extremely stable).
- Worker ↔ Runware is a server-to-server fetch (no QUIC fragility,
Cloudflare's backbone handles transit).
- Worker buffers the full upstream response → client never sees a
half-stream.
- Bonus: CF edge cache (cacheEverything, 1y TTL) on Runware UUIDs;
Access-Control-Allow-Origin: * so client fetch() can't hit CORS.
- Hardened: only proxies im.runware.ai, only GET/HEAD/OPTIONS, all
other hosts/methods → 403/405.
Wired via NEXT_PUBLIC_IMAGE_PROXY_URL (inlined at build). Empty → no proxy
→ direct fetch (which still uses the blob path, just exposed to QUIC).
──────────────────────────────────────────────────────────────────────
Deploy steps (one-time, do this AFTER pulling this commit):
1. Install wrangler globally:
npm i -g wrangler
2. Log in to Cloudflare (opens browser for OAuth):
wrangler login
3. From the worker/ directory, deploy:
cd worker
wrangler deploy
wrangler will print the deployed URL, e.g.
https://infiplot-image-proxy.<your-cf-username>.workers.dev
4. Paste that URL into .env.local for local dev:
NEXT_PUBLIC_IMAGE_PROXY_URL=https://infiplot-image-proxy.<...>.workers.dev
…and into Vercel project settings (Environment Variables) for prod.
NEXT_PUBLIC_ vars are inlined at build time, so the URL bakes into
the bundle on the next deploy/dev-server restart.
5. Restart dev server (pnpm dev) so the new env baked in. Generate a
scene; Network tab should show requests going to *.workers.dev
instead of im.runware.ai, no ERR_QUIC_PROTOCOL_ERROR, image renders
atomically.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
80 lines
4.1 KiB
Bash
80 lines
4.1 KiB
Bash
# =============================================================
|
|
# InfiPlot — AI 实时交互剧情游戏
|
|
# Recommended setup: Xiaomi MiMo Token Plan for TEXT / VISION / TTS
|
|
# (one API key covers all three) + Runware for IMAGE (FLUX.2 [klein]).
|
|
#
|
|
# TEXT / VISION use any OpenAI-compatible endpoint (any OpenAI-
|
|
# compatible host works: OpenRouter, OpenAI, Anthropic via proxy,
|
|
# Gemini, DeepSeek, Ollama, ...).
|
|
# TTS uses Xiaomi MiMo's own voice design / clone protocol
|
|
# (not OpenAI-compatible; appends -voicedesign / -voiceclone).
|
|
#
|
|
# IMAGE uses Runware's own task-array protocol (not OpenAI-compatible);
|
|
# the adapter posts an `imageInference` task to IMAGE_BASE_URL.
|
|
# =============================================================
|
|
|
|
# ---- 1. Text LLM · scene director ----------------------------------
|
|
# Any OpenAI-compatible endpoint works: OpenAI, Anthropic (via proxy),
|
|
# Gemini, OpenRouter, DeepSeek, OpenCode, MiMo, local Ollama, …
|
|
# Recommended starters:
|
|
# A. DeepSeek v4-flash direct (https://api.deepseek.com/v1) — pay-as-you-go,
|
|
# fastest first-token latency, very stable JSON output.
|
|
# B. OpenCode Go (https://opencode.ai/zen/go/v1) — $10/mo flat-rate bundle of
|
|
# 12 open-source models (DeepSeek v4-flash, Qwen, Kimi, GLM, MiMo, …).
|
|
# Cheaper at high volume, slower at the tail.
|
|
# C. MiMo v2.5 via Xiaomi Token Plan — bundles VISION + TTS in one tp- key.
|
|
TEXT_BASE_URL=https://api.deepseek.com/v1
|
|
TEXT_API_KEY=sk-xxx
|
|
TEXT_MODEL=deepseek-v4-flash
|
|
|
|
# ---- 2. Image generator (renders the scene background) -------------
|
|
# Recommended: Runware + FLUX.2 [klein] 9B KV — distilled 4-step model,
|
|
# sub-second inference at ~$0.0008/image. Sign up at https://runware.ai
|
|
# AIR ids for FLUX.2 [klein] variants:
|
|
# runware:400@1 · 4B (smaller)
|
|
# runware:400@6 · 9B KV (recommended — fastest at 16:9)
|
|
IMAGE_BASE_URL=https://api.runware.ai/v1
|
|
IMAGE_API_KEY=runware-xxx
|
|
IMAGE_MODEL=runware:400@6
|
|
|
|
# ---- 3. Vision model · multimodal click interpretation -------------
|
|
# Recommended: MiMo V2.5 — multimodal, accepts image_url content parts.
|
|
VISION_BASE_URL=https://token-plan-sgp.xiaomimimo.com/v1
|
|
VISION_API_KEY=tp-xxx
|
|
VISION_MODEL=mimo-v2.5
|
|
|
|
# ---- 4. TTS · Xiaomi MiMo (optional — leave blank to disable) ------
|
|
# Per-character voice design → clone, with per-line delivery direction.
|
|
# Voice identity = the reference audio kept in the session (no server expiry).
|
|
# The adapter appends -voicedesign / -voiceclone to TTS_SPEECH_MODEL.
|
|
TTS_BASE_URL=https://token-plan-sgp.xiaomimimo.com/v1
|
|
TTS_API_KEY=tp-xxx
|
|
TTS_SPEECH_MODEL=mimo-v2.5-tts
|
|
|
|
# ---- 5. MOCK_IMAGE — skip image generation (cheap TTS testing) -----
|
|
# true → return a placeholder image instead of calling the image model.
|
|
# Text/story/voice still run normally. Great for iterating on TTS.
|
|
MOCK_IMAGE=false
|
|
|
|
# ---- 5b. Image proxy (Cloudflare Worker, optional) -----------------
|
|
# Chrome's direct fetch of im.runware.ai is unreliable on some networks
|
|
# (ERR_QUIC_PROTOCOL_ERROR mid-stream → partial bytes → <img> renders
|
|
# progressively from top to bottom). Routing the fetch through a CF Worker
|
|
# (see worker/) avoids the QUIC fragility and adds edge caching + CORS.
|
|
# Empty → no proxy → direct fetch (fine when the network behaves).
|
|
# NEXT_PUBLIC_ vars are inlined at BUILD time — set in Vercel project settings.
|
|
# Deploy the Worker per worker/wrangler.toml, then paste the workers.dev URL:
|
|
NEXT_PUBLIC_IMAGE_PROXY_URL=
|
|
|
|
# ---- 6. Analytics · Umami (optional — leave blank to disable) ------
|
|
# Privacy-friendly, cookieless page-view stats — no Cookie consent banner.
|
|
# Cloud: sign up at https://cloud.umami.is, add your site, copy its ID into
|
|
# NEXT_PUBLIC_UMAMI_WEBSITE_ID and use the cloud script URL:
|
|
# NEXT_PUBLIC_UMAMI_SRC=https://cloud.umami.is/script.js
|
|
# Self-host later: point SRC at your own instance — the integration is identical
|
|
# (no code change), e.g. NEXT_PUBLIC_UMAMI_SRC=https://stats.example.com/script.js
|
|
# Both blank → no script is injected (zero tracking). NEXT_PUBLIC_ vars are
|
|
# inlined at BUILD time, so set them in the build env (Vercel project settings).
|
|
NEXT_PUBLIC_UMAMI_SRC=
|
|
NEXT_PUBLIC_UMAMI_WEBSITE_ID=
|