0dea2f8e36
Three follow-ups to ef3b579 (OpenAI SDK migration) and ebe39ef (canvas frame):

- .env.example / config.ts / AGENTS.md: the anthropic & google native protocols were removed with the Vercel AI SDK, but .env.example and AGENTS.md still advertised them. Rewrite the docs to point Claude/Gemini at their OpenAI-compatible endpoints (api.anthropic.com/v1, generativelanguage.googleapis.com/v1beta/openai), drop the dead Gemini "Nano Banana" image example, sync AGENTS.md (text/vision protocol list, image protocol list, the "OpenAI/Gemini via AI SDK" reference note), and append a short hint to the readProvider() error message guiding anthropic/google users to openai_compatible instead of a bare rejection.
- chat.ts: drop the unsafe `as { prompt_tokens_details?: ... }` cast; read cached_tokens straight off the SDK's CompletionUsage type. Add a comment noting that the OpenAI usage object reports cache reads only (no cache-write count), so the create cost the old AI SDK path logged is unrecoverable.
- PlayCanvas.tsx: revert <img key={imageUrl}> to key={imageUrl.slice(-48)}. The gpt-image/mock paths emit multi-MB data URIs; using the full string as React's reconciliation key adds avoidable diff overhead during the frequent re-renders. Matches the existing <audio> element's key convention.

Validation: pnpm typecheck passes. (pnpm lint fails on a pre-existing Next 16 `next lint` CLI issue, identical on staging and unrelated to this change.)
# =============================================================
# InfiPlot: AI real-time interactive story game
# Recommended setup: Xiaomi MiMo Token Plan for TEXT / VISION / TTS
# (one API key covers all three) + Runware for IMAGE (FLUX.2 [klein]).
#
# TEXT and VISION speak the OpenAI wire format, and IMAGE can as well.
# Anthropic Claude and Google Gemini are reachable through their own
# OpenAI-compatible endpoints (see TEXT_PROVIDER notes below), so no
# native protocol switch is needed.
# TTS uses Xiaomi MiMo's own voice design / clone protocol
# (not OpenAI-compatible; appends -voicedesign / -voiceclone).
#
# IMAGE supports Runware (its own task-array protocol) and OpenAI (gpt-image)
# via IMAGE_PROVIDER.
#
# *_PROVIDER (optional) selects the wire protocol; leave unset for the
# OpenAI-compatible default (image is auto-detected from the URL). Valid
# values are openai_compatible (text, vision, image) and openai / runware
# (image only). The native "anthropic" / "google" protocols were removed
# when the Vercel AI SDK was dropped.
# Base URLs tolerate a missing or extra /v1 (or a trailing /chat/completions);
# the engine normalizes them.
# =============================================================
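# Example (illustrative; a reading of the normalization note above, not
# verified against the engine's code). These spellings are expected to
# resolve to the same endpoint:
#   TEXT_BASE_URL=https://api.deepseek.com
#   TEXT_BASE_URL=https://api.deepseek.com/v1
#   TEXT_BASE_URL=https://api.deepseek.com/v1/chat/completions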

# ---- 1. Text LLM · scene director ----------------------------------
# Any OpenAI-compatible endpoint works: OpenAI, Anthropic and Gemini (via
# their OpenAI-compatible endpoints, see below), OpenRouter, DeepSeek,
# OpenCode, MiMo, local Ollama, …
# Recommended starters:
#   A. DeepSeek v4-flash direct (https://api.deepseek.com/v1): pay-as-you-go,
#      fastest first-token latency, very stable JSON output.
#   B. OpenCode Go (https://opencode.ai/zen/go/v1): $10/mo flat-rate bundle of
#      12 open-source models (DeepSeek v4-flash, Qwen, Kimi, GLM, MiMo, …).
#      Cheaper at high volume, slower at the tail.
#   C. MiMo v2.5 via Xiaomi Token Plan: bundles VISION + TTS in one tp- key.
TEXT_BASE_URL=https://api.deepseek.com/v1
TEXT_API_KEY=sk-xxx
TEXT_MODEL=deepseek-v4-flash
# TEXT_PROVIDER: openai_compatible (default). This is the ONLY supported text
# protocol. To use Claude or Gemini, leave TEXT_PROVIDER unset and point at
# their OpenAI-compatible endpoints:
#   Claude → TEXT_BASE_URL=https://api.anthropic.com/v1  TEXT_MODEL=claude-sonnet-4-6
#   Gemini → TEXT_BASE_URL=https://generativelanguage.googleapis.com/v1beta/openai  TEXT_MODEL=gemini-3.5-flash
# TEXT_PROVIDER=openai_compatible
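# Example: a full Claude setup, assembled from the notes above. The key value
# is a placeholder (assumption), and TEXT_PROVIDER stays unset. To try it,
# comment out the DeepSeek values and uncomment these three lines:
# TEXT_BASE_URL=https://api.anthropic.com/v1
# TEXT_API_KEY=sk-ant-xxx
# TEXT_MODEL=claude-sonnet-4-6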

# ---- 2. Image generator (renders the scene background) -------------
# Recommended: Runware + FLUX.2 [klein] 9B KV, a distilled 4-step model with
# sub-second inference at ~$0.0008/image. Sign up at https://runware.ai
# AIR ids for FLUX.2 [klein] variants:
#   runware:400@1 · 4B (smaller)
#   runware:400@6 · 9B KV (recommended, fastest at 16:9)
IMAGE_BASE_URL=https://api.runware.ai/v1
IMAGE_API_KEY=runware-xxx
IMAGE_MODEL=runware:400@6
# IMAGE_PROVIDER: runware (auto-detected for runware.ai) | openai_compatible | openai
#   openai → gpt-image, supports referenceImages (character/scene continuity).
#     IMAGE_BASE_URL=https://api.openai.com  IMAGE_MODEL=gpt-image-1
#   NOTE: openai returns raw bytes → inlined as a data: URI for the session
#   (heavier per-call transport than Runware's UUID re-reference loop). Runware
#   stays fastest + cheapest for the scene-by-scene flow.
# IMAGE_PROVIDER=runware
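# Example: switching the image backend to gpt-image, assembled from the notes
# above (a sketch; the key value is a placeholder). To try it, comment out the
# Runware values and uncomment these four lines:
# IMAGE_BASE_URL=https://api.openai.com
# IMAGE_API_KEY=sk-xxx
# IMAGE_MODEL=gpt-image-1
# IMAGE_PROVIDER=openai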

# Optional image-latency guards. BOTH default to OFF when unset: leaving
# them blank keeps the exact historical behavior, so self-hosted deploys are
# unaffected unless they opt in.
#   IMAGE_TIMEOUT_MS: per-attempt hard deadline (in ms) for image requests; a
#     timed-out attempt is retried like a 5xx. Recommended 30000 for Runware
#     (healthy-day p99 is ~26-37s; Runware's own gateway 504s at ~55s).
#   IMAGE_HEDGE_MS: scene-paint hedging. If the referenced scene paint has
#     not finished after this many ms, race a second identical request and
#     keep whichever finishes first (the loser is aborted, but the provider
#     may still bill it). Rescues straggler tasks; never fires when the first
#     attempt already failed (e.g. 429/503 saturation). Recommended 15000 for
#     Runware (healthy-day p95). Do NOT set thresholds this low for providers
#     that are normally slow (e.g. gpt-image takes 20-60s per image).
# IMAGE_TIMEOUT_MS=30000
# IMAGE_HEDGE_MS=15000
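# Worked example (a reading of the notes above, not a trace from the engine),
# with IMAGE_TIMEOUT_MS=30000 and IMAGE_HEDGE_MS=15000:
#       0 ms  first scene-paint request starts
#   15000 ms  still pending → a second identical request starts racing it
#   22000 ms  one of the two finishes → it is kept, the other is aborted
#   30000 ms  (only if the first attempt is still pending) it hits its
#             per-attempt deadline and is retried like a 5xx
# The 22000 ms finish time is invented for illustration. If the first attempt
# fails outright before 15000 ms (e.g. 429/503), no hedge request is sent.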

# ---- 3. Vision model · multimodal click interpretation -------------
# Recommended: MiMo v2.5 (multimodal, accepts image_url content parts).
VISION_BASE_URL=https://token-plan-sgp.xiaomimimo.com/v1
VISION_API_KEY=tp-xxx
VISION_MODEL=mimo-v2.5
# VISION_PROVIDER: openai_compatible (default). Only openai_compatible is
# supported; reach Claude/Gemini via their OpenAI-compatible endpoints
# (same base URLs as TEXT above). Leave unset to use the default.
# VISION_PROVIDER=openai_compatible

# ---- 4. TTS (optional; leave blank to disable) ---------------------
# Provider is auto-detected from TTS_BASE_URL host:
#   *stepfun.com → StepFun (preset voices, keyword-scored selection)
#   otherwise    → Xiaomi MiMo (voicedesign + voiceclone)
#
# Xiaomi MiMo: per-character voice design → clone, with per-line delivery.
#   TTS_BASE_URL=https://token-plan-sgp.xiaomimimo.com/v1
#   TTS_API_KEY=tp-xxx
#   TTS_SPEECH_MODEL=mimo-v2.5-tts
#
# StepFun: 32 preset voices, auto-selected by gender + age + tone scoring.
#   TTS_BASE_URL=https://api.stepfun.com/v1
#   TTS_API_KEY=sk-xxx
#   TTS_SPEECH_MODEL=step-tts-mini   # or step-tts-2 / stepaudio-2.5-tts
TTS_BASE_URL=https://token-plan-sgp.xiaomimimo.com/v1
TTS_API_KEY=tp-xxx
TTS_SPEECH_MODEL=mimo-v2.5-tts

# ---- 5. MOCK_IMAGE: skip image generation (cheap TTS testing) ------
# true → return a placeholder image instead of calling the image model.
# Text/story/voice still run normally. Great for iterating on TTS.
MOCK_IMAGE=false

# ---- 5b. Image proxy (Cloudflare Worker, OPTIONAL) -----------------
# Leave NEXT_PUBLIC_IMAGE_PROXY_URL blank (the default) and the browser
# fetches images directly from the provider, exactly as the app worked
# before this proxy existed. The ALLOWED_HOSTS value below is inert until
# a proxy URL is set, so you're completely unaffected; skip this section.
#
# Why you might want it: Chrome's direct fetch of im.runware.ai is unreliable
# on some networks (ERR_QUIC_PROTOCOL_ERROR mid-stream → partial bytes →
# <img> paints progressively top-to-bottom). Routing the fetch through a tiny
# Cloudflare Worker re-fetches server-to-server (no QUIC fragility) and serves
# over HTTP/2: atomic paint, plus edge caching + CORS.
#
# Deploy your own in ~1 min (one-click "Deploy to Cloudflare" button):
#   https://github.com/zonghaoyuan/infiplot-image-proxy
# Then paste the workers.dev URL it prints below. NEXT_PUBLIC_ vars are
# inlined at BUILD time, so set them in Vercel/Cloudflare project settings.
NEXT_PUBLIC_IMAGE_PROXY_URL=
# Hostnames the proxy is allowed to fetch (comma-separated). Default covers
# Runware's CDN. If your IMAGE_BASE_URL points at another provider, add that
# provider's image host here so its URLs take the proxy path too. Anything
# not listed stays on the direct fetch. Only matters when the URL above is set.
NEXT_PUBLIC_IMAGE_PROXY_ALLOWED_HOSTS=im.runware.ai
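# Example (a sketch; both the worker URL and cdn.example.com are hypothetical
# placeholders, not real deployments). With a proxy deployed and a second
# image host in use, the two values might look like:
#   NEXT_PUBLIC_IMAGE_PROXY_URL=https://my-image-proxy.example.workers.dev
#   NEXT_PUBLIC_IMAGE_PROXY_ALLOWED_HOSTS=im.runware.ai,cdn.example.com
# Images on any other host would keep using the direct fetch.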

# ---- 6. Analytics · Umami (optional; leave blank to disable) -------
# Privacy-friendly, cookieless page-view stats, so no cookie consent banner.
# Cloud: sign up at https://cloud.umami.is, add your site, copy its ID into
# NEXT_PUBLIC_UMAMI_WEBSITE_ID and use the cloud script URL:
#   NEXT_PUBLIC_UMAMI_SRC=https://cloud.umami.is/script.js
# Self-host later: point SRC at your own instance; the integration is identical
# (no code change), e.g. NEXT_PUBLIC_UMAMI_SRC=https://stats.example.com/script.js
# Both blank → no script is injected (zero tracking; every track() call no-ops).
# Beyond page views the app emits content-free custom events (game start, scene
# reached, choice picked, ...): only enums/counts/booleans, never your prompts,
# uploaded images or any per-user ID. The visitor's Do-Not-Track is honoured.
# NEXT_PUBLIC_ vars are inlined at BUILD time, so set them in the build env
# (Vercel project settings).
NEXT_PUBLIC_UMAMI_SRC=
NEXT_PUBLIC_UMAMI_WEBSITE_ID=

# Optional hostname allowlist: defense-in-depth on top of the blank-to-disable
# gate above. The tracker fires only when window.location.hostname EXACTLY
# matches an entry, so a fork that copied these vars stays silent on its own
# domain. Comma-separated, exact match: apex ≠ www (list both), no wildcards.
# Blank → track on all hosts. e.g. infiplot.com,www.infiplot.com
NEXT_PUBLIC_UMAMI_DOMAINS=
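# Example of the exact-match rule described above (staging.infiplot.com is a
# hypothetical hostname used only for illustration). With
#   NEXT_PUBLIC_UMAMI_DOMAINS=infiplot.com,www.infiplot.com
# the tracker would behave as follows:
#   infiplot.com          → tracked
#   www.infiplot.com      → tracked
#   staging.infiplot.com  → not tracked (no wildcard or subdomain matching)
#   localhost             → not tracked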

# ---- 7. Gallery share files (optional; leave blank to disable) -----
# Server-side secret used to AES-256-GCM encrypt a played session into a
# binary `.infiplot` share file the player can send to a friend. Friends drop
# the file into /gallery; the server decrypts and renders the same interactive
# replay. GCM's built-in auth tag also gives tamper-detection for free.
# Blank → the "导出分享文件" (Export share file) button is hidden and only the
# same-browser localStorage flow remains. Set to any high-entropy string
# ≥ 32 chars (e.g. `openssl rand -hex 32`).
# WARNING: rotating this secret invalidates every share file ever issued
# (decryption will fail with "文件校验失败", i.e. "file verification failed").
# Only change it when you're OK with that.
GALLERY_SECRET=
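# Example: generating a suitable secret (run in a shell, then paste the output
# as the GALLERY_SECRET value):
#   openssl rand -hex 32
# This prints 64 hex characters (32 random bytes), which satisfies the
# 32-character minimum stated above.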

# ---- 8. Auth · Supabase (optional; leave blank to disable) ---------
# Sign up at https://supabase.com, create a project, copy the URL and
# publishable key (starts with sb_publishable_ or eyJ…).
# Both blank → login UI is completely absent, all API routes run unguarded,
# and the app behaves exactly as before this feature existed.
# NEXT_PUBLIC_ vars are inlined at BUILD time.
NEXT_PUBLIC_SUPABASE_URL=
NEXT_PUBLIC_SUPABASE_PUBLISHABLE_KEY=