fix(ai-client): clean up regressions from OpenAI SDK migration and canvas frame fix (#74)
Three follow-ups to ef3b579 (OpenAI SDK migration) and ebe39ef (canvas frame):

- .env.example / config.ts / AGENTS.md: the anthropic and google native protocols were removed along with the Vercel AI SDK, but .env.example and AGENTS.md still advertised them. Rewrite the docs to point Claude/Gemini at their OpenAI-compatible endpoints (api.anthropic.com/v1, generativelanguage.googleapis.com/v1beta/openai), drop the dead Gemini "Nano Banana" image example, sync AGENTS.md (text/vision protocol list, image protocol list, the "OpenAI/Gemini via AI SDK" reference note), and append a short hint to the readProvider() error message that guides anthropic/google users to openai_compatible instead of issuing a bare rejection.
- chat.ts: drop the unsafe `as { prompt_tokens_details?: ... }` cast and read cached_tokens straight off the SDK's CompletionUsage type. Add a comment noting that the OpenAI usage object reports cache reads only (no cache-write count), so the cache-creation cost that the old AI SDK path logged can no longer be recovered.
- PlayCanvas.tsx: revert <img key={imageUrl}> to key={imageUrl.slice(-48)}. The gpt-image/mock paths emit multi-MB data URIs, and using the full string as React's reconciliation key adds avoidable diff overhead during the frequent re-renders. This matches the existing <audio> element's key convention.

Validation: pnpm typecheck passes. pnpm lint fails on a pre-existing Next 16 `next lint` CLI issue that is identical on staging and unrelated to this change.
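As a rough illustration of the chat.ts change, the sketch below reads cached_tokens through optional chaining instead of a type cast. The `UsageLike` interface is a local stand-in that mirrors the relevant fields of the SDK's CompletionUsage type, and `summarizeUsage` and `TokenSummary` are hypothetical names, not taken from the repository.

```typescript
// Local structural stand-in for the usage object on a chat completion.
// Field names follow the OpenAI API; the interface itself is illustrative.
interface UsageLike {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
  prompt_tokens_details?: { cached_tokens?: number };
}

// Hypothetical summary shape for logging.
interface TokenSummary {
  input: number;
  output: number;
  cacheRead: number;
}

// Hypothetical helper: summarise token usage without any type cast.
function summarizeUsage(usage: UsageLike | undefined): TokenSummary {
  return {
    input: usage?.prompt_tokens ?? 0,
    output: usage?.completion_tokens ?? 0,
    // Cache reads only: the usage object carries no cache-write count.
    cacheRead: usage?.prompt_tokens_details?.cached_tokens ?? 0,
  };
}
```

Because every optional field falls back to 0, a provider that omits prompt_tokens_details is simply logged as having no cache reads.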
+21
-19
@@ -3,18 +3,22 @@
 # Recommended setup: Xiaomi MiMo Token Plan for TEXT / VISION / TTS
 # (one API key covers all three) + Runware for IMAGE (FLUX.2 [klein]).
 #
-# TEXT / VISION default to any OpenAI-compatible endpoint, and can switch to
-# native Anthropic or Google Gemini via TEXT_PROVIDER / VISION_PROVIDER.
+# TEXT / VISION / IMAGE all speak the OpenAI wire format. Anthropic Claude
+# and Google Gemini are reachable through their own OpenAI-compatible
+# endpoints (see TEXT_PROVIDER notes below) — no native protocol switch is
+# needed.
 # TTS uses Xiaomi MiMo's own voice design / clone protocol
 # (not OpenAI-compatible; appends -voicedesign / -voiceclone).
 #
-# IMAGE supports Runware (its own task-array protocol), OpenAI (gpt-image),
-# and Google Gemini (Nano Banana) via IMAGE_PROVIDER.
+# IMAGE supports Runware (its own task-array protocol) and OpenAI (gpt-image)
+# via IMAGE_PROVIDER.
 #
 # *_PROVIDER (optional) selects the wire protocol; leave unset for the
-# OpenAI-compatible default (image is auto-detected from the URL). Base URLs
-# tolerate a missing or extra /v1 (or a trailing /chat/completions) — the
-# engine normalizes them.
+# OpenAI-compatible default (image is auto-detected from the URL). Valid
+# values are openai_compatible / openai / runware — native "anthropic" /
+# "google" protocols were removed when the Vercel AI SDK was dropped.
+# Base URLs tolerate a missing or extra /v1 (or a trailing /chat/completions)
+# — the engine normalizes them.
 # =============================================================
 
 # ---- 1. Text LLM · scene director ----------------------------------
@@ -30,9 +34,11 @@
 TEXT_BASE_URL=https://api.deepseek.com/v1
 TEXT_API_KEY=sk-xxx
 TEXT_MODEL=deepseek-v4-flash
-# TEXT_PROVIDER: openai_compatible (default) | anthropic | google
-# anthropic → TEXT_BASE_URL=https://api.anthropic.com TEXT_MODEL=claude-sonnet-4-6
-# google → TEXT_BASE_URL=https://generativelanguage.googleapis.com TEXT_MODEL=gemini-3.5-flash
+# TEXT_PROVIDER: openai_compatible (default). This is the ONLY supported text
+# protocol. To use Claude or Gemini, leave TEXT_PROVIDER unset and point at
+# their OpenAI-compatible endpoints:
+# Claude → TEXT_BASE_URL=https://api.anthropic.com/v1 TEXT_MODEL=claude-sonnet-4-6
+# Gemini → TEXT_BASE_URL=https://generativelanguage.googleapis.com/v1beta/openai TEXT_MODEL=gemini-3.5-flash
 # TEXT_PROVIDER=openai_compatible
 
 # ---- 2. Image generator (renders the scene background) -------------
@@ -44,14 +50,10 @@ TEXT_MODEL=deepseek-v4-flash
 IMAGE_BASE_URL=https://api.runware.ai/v1
 IMAGE_API_KEY=runware-xxx
 IMAGE_MODEL=runware:400@6
-# IMAGE_PROVIDER: runware (auto-detected for runware.ai) | openai_compatible
-# | openai | google
+# IMAGE_PROVIDER: runware (auto-detected for runware.ai) | openai_compatible | openai
 # openai → gpt-image, supports referenceImages (character/scene continuity).
 # IMAGE_BASE_URL=https://api.openai.com IMAGE_MODEL=gpt-image-1
-# google → Gemini "Nano Banana" (Imagen is EOL 2026-06-24, do not use it).
-# IMAGE_BASE_URL=https://generativelanguage.googleapis.com
-# IMAGE_MODEL=gemini-2.5-flash-image
-# NOTE: openai/google return raw bytes → inlined as a data: URI for the session
+# NOTE: openai returns raw bytes → inlined as a data: URI for the session
 # (heavier per-call transport than Runware's UUID re-reference loop). Runware
 # stays fastest + cheapest for the scene-by-scene flow.
 # IMAGE_PROVIDER=runware
@@ -77,9 +79,9 @@ IMAGE_MODEL=runware:400@6
 VISION_BASE_URL=https://token-plan-sgp.xiaomimimo.com/v1
 VISION_API_KEY=tp-xxx
 VISION_MODEL=mimo-v2.5
-# VISION_PROVIDER: openai_compatible (default) | anthropic | google
-# anthropic → VISION_BASE_URL=https://api.anthropic.com VISION_MODEL=claude-sonnet-4-6
-# google → VISION_BASE_URL=https://generativelanguage.googleapis.com VISION_MODEL=gemini-3.5-flash
+# VISION_PROVIDER: openai_compatible (default). Only openai_compatible is
+# supported — reach Claude/Gemini via their OpenAI-compatible endpoints
+# (same base URLs as TEXT above). Leave unset to use the default.
 # VISION_PROVIDER=openai_compatible
 
 # ---- 4. TTS (optional — leave blank to disable) --------------------
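The provider rules documented in the comments above could be enforced roughly as follows. This is a minimal sketch only: the real readProvider() signature is not shown in this commit, so `readProviderSketch`, its parameters, and the exact wording of the error and hint are assumptions made for illustration.

```typescript
// Values listed as valid in the updated env comments.
const SUPPORTED = ["openai_compatible", "openai", "runware"] as const;
type Provider = (typeof SUPPORTED)[number];

// Native protocols that were removed and now earn a migration hint.
const REMOVED = new Set<string>(["anthropic", "google"]);

// Hypothetical stand-in for readProvider(): validates one *_PROVIDER value.
// `allowed` is the subset accepted for this slot (for example, text accepts
// only openai_compatible); `fallback` is returned when the value is unset.
function readProviderSketch(
  envName: string,
  raw: string | undefined,
  allowed: readonly Provider[],
  fallback: Provider,
): Provider {
  const value = (raw ?? "").trim().toLowerCase();
  if (value === "") return fallback;

  const match = allowed.find((p) => p === value);
  if (match) return match;

  let message = `${envName}="${value}" is not supported; expected one of: ${allowed.join(", ")}.`;
  if (REMOVED.has(value)) {
    // Guide users of the removed native protocols instead of a bare rejection.
    message +=
      " Use openai_compatible and point the base URL at the vendor's OpenAI-compatible endpoint.";
  }
  throw new Error(message);
}
```

For example, calling `readProviderSketch("TEXT_PROVIDER", "anthropic", ["openai_compatible"], "openai_compatible")` throws an error whose message ends with the migration hint, while an unset value silently resolves to the fallback.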