Squash-merge the cloudflare-migration branch (7 commits by Kai ki) into
staging with conflict resolution, feature integration, and bug fixes.
Engine:
- Paradigm D: single-stream Writer replacing dual-phase Plan/Beats
- Delete Architect agent; story bible generated via Writer <plan> tag
- Modular prompt architecture (segments/registry/builder)
- StreamRouter for tagged stream splitting (<plan>/<story>/<choices>)
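The tagged stream splitting listed above could look roughly like the sketch below. It is not the actual StreamRouter: the class name TagRouterSketch, the helper routeAll, and the rule that text outside any tag is dropped are all assumptions made for illustration. The point it shows is that a tag may arrive split across two chunks, so the router has to hold back a possible partial tag until more input arrives.

```typescript
type Channel = "plan" | "story" | "choices";

// Longest tag we need to recognise; a "<" followed by this many characters
// without a ">" cannot be one of our tags.
const MAX_TAG_LEN = "</choices>".length;

class TagRouterSketch {
  private buffer = "";
  private current: Channel | null = null;
  private readonly emit: (channel: Channel, text: string) => void;

  constructor(emit: (channel: Channel, text: string) => void) {
    this.emit = emit;
  }

  push(chunk: string): void {
    this.buffer += chunk;
    for (;;) {
      const lt = this.buffer.indexOf("<");
      if (lt === -1) {
        // No tag candidate at all: everything buffered is plain text.
        this.flush(this.buffer);
        this.buffer = "";
        return;
      }
      this.flush(this.buffer.slice(0, lt));
      this.buffer = this.buffer.slice(lt);
      const gt = this.buffer.indexOf(">");
      if (gt === -1) {
        // Possibly a tag split across chunks: wait for more input.
        if (this.buffer.length < MAX_TAG_LEN) return;
        // Too long to be a known tag: treat the "<" as literal text.
        this.flush("<");
        this.buffer = this.buffer.slice(1);
        continue;
      }
      const match = /^<(\/?)(plan|story|choices)>$/.exec(this.buffer.slice(0, gt + 1));
      if (match) {
        this.current = match[1] === "/" ? null : (match[2] as Channel);
        this.buffer = this.buffer.slice(gt + 1);
      } else {
        // Not one of our tags: the "<" is ordinary text.
        this.flush("<");
        this.buffer = this.buffer.slice(1);
      }
    }
  }

  end(): void {
    this.flush(this.buffer);
    this.buffer = "";
  }

  // Text outside any open tag is dropped (an assumption of this sketch).
  private flush(text: string): void {
    if (text.length > 0 && this.current !== null) this.emit(this.current, text);
  }
}

// Hypothetical helper: route a list of chunks and collect text per channel.
function routeAll(chunks: string[]): Record<Channel, string> {
  const out: Record<Channel, string> = { plan: "", story: "", choices: "" };
  const router = new TagRouterSketch((channel, text) => {
    out[channel] += text;
  });
  for (const chunk of chunks) router.push(chunk);
  router.end();
  return out;
}
```

In a streaming UI the emit callback would forward each piece as it arrives instead of accumulating it, which is what lets <story> text render before <choices> is complete.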
Infrastructure:
- Cloudflare Workers deployment (wrangler.jsonc, OpenNext adapter)
- D1 database schema + Drizzle ORM (scaffolded, not yet active)
- R2 storage helpers (scaffolded, not yet active)
- Story persistence API routes + client-side persistence
BYOK (Bring Your Own Key):
- /api/llm/user-proxy: SSRF-protected LLM proxy route (+ requireUser auth)
- CORS-aware fetch in ai-client: auto-detect CORS failure and fall back to
the server proxy transparently via the OpenAI SDK's custom fetch
- BYO config support added to classify-freeform and vision routes
- SettingsModal CORS privacy notice (keys never logged/stored)
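A minimal sketch of the CORS fallback idea from the BYOK list above, assuming the two transports are injected as plain functions. The names withProxyFallback, FetchLike, direct and viaProxy are hypothetical and not taken from ai-client. Browsers reject a fetch() blocked by CORS with a TypeError that cannot be told apart from other network failures, so the sketch treats any TypeError from the direct call as a signal to switch to the proxy and remembers that decision.

```typescript
// Minimal fetch-like signature so the sketch does not depend on DOM types.
type FetchLike<Init, Res> = (url: string, init?: Init) => Promise<Res>;

function withProxyFallback<Init, Res>(
  direct: FetchLike<Init, Res>,
  viaProxy: FetchLike<Init, Res>,
): FetchLike<Init, Res> {
  // Once the direct path has failed, skip it for subsequent calls.
  let corsBlocked = false;
  return async (url, init) => {
    if (!corsBlocked) {
      try {
        return await direct(url, init);
      } catch (err) {
        // Only network-level failures (TypeError) trigger the fallback;
        // anything else is a real error and is rethrown.
        if (!(err instanceof TypeError)) throw err;
        corsBlocked = true;
      }
    }
    return viaProxy(url, init);
  };
}
```

The wrapped function keeps the shape of fetch, so it could be handed to a client that accepts a custom fetch implementation, as the OpenAI SDK does through its fetch constructor option.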
SSE streaming:
- engineClient.ts: fetchSSE helper for progressive scene events
- startSession/requestScene accept optional emit callback
- Fix SSE error event field name (error → message) in scene/start routes
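For illustration, the parsing half of an SSE helper might be sketched as follows. This is not the fetchSSE code from engineClient.ts: createSSEParser and SSEEvent are hypothetical names, and the sketch handles only the event and data fields with LF or CRLF line endings. It shows why the parser buffers input (an event can be split across network chunks) and how an error event carrying a message field, as in the fix above, would be read.

```typescript
interface SSEEvent {
  event: string; // event name; "message" when the stream gives none
  data: string;  // data lines joined with "\n"
}

function createSSEParser(onEvent: (e: SSEEvent) => void): (chunk: string) => void {
  let buffer = "";
  return (chunk: string): void => {
    // Normalise on the whole buffer so a CRLF split across chunks is handled.
    buffer = (buffer + chunk).replace(/\r\n/g, "\n");
    let sep = buffer.indexOf("\n\n");
    while (sep !== -1) {
      const block = buffer.slice(0, sep);
      buffer = buffer.slice(sep + 2);
      let event = "message";
      const data: string[] = [];
      for (const line of block.split("\n")) {
        if (line.startsWith(":")) continue; // comment / keep-alive line
        const colon = line.indexOf(":");
        const field = colon === -1 ? line : line.slice(0, colon);
        let value = colon === -1 ? "" : line.slice(colon + 1);
        if (value.startsWith(" ")) value = value.slice(1); // one optional space
        if (field === "event") event = value;
        else if (field === "data") data.push(value);
      }
      if (data.length > 0) onEvent({ event, data: data.join("\n") });
      sep = buffer.indexOf("\n\n");
    }
  };
}
```

A caller would feed it decoded text from the response body and, for an event named error, read JSON.parse(e.data).message rather than an error field.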
i18n integration:
- Wire buildLanguageDirective into paradigm D's prompt builder
- Update corsNotice i18n keys (zh-CN/en/ja) with CORS proxy privacy text
- Preserve Session.language + LanguageSwitcher from i18n commit
Co-authored-by: Kai ki <155355644+zbf1009@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Three follow-ups to ef3b579 (OpenAI SDK migration) and ebe39ef (canvas frame):
- .env.example / config.ts / AGENTS.md: anthropic & google native protocols
were removed with the Vercel AI SDK, but .env.example and AGENTS.md still
advertised them. Rewrite the docs to point Claude/Gemini at their
OpenAI-compatible endpoints (api.anthropic.com/v1,
generativelanguage.googleapis.com/v1beta/openai), drop the dead Gemini
"Nano Banana" image example, sync AGENTS.md (text/vision protocol list,
image protocol list, the "OpenAI/Gemini via AI SDK" reference note), and
append a short hint to the readProvider() error message guiding
anthropic/google users to openai_compatible instead of a bare rejection.
- chat.ts: drop the unsafe `as { prompt_tokens_details?: ... }` cast; read
cached_tokens straight off the SDK's CompletionUsage type. Add a comment
noting the OpenAI usage object reports cache reads only (no cache-write
count), so the cache-creation cost the old AI SDK path logged can no longer
be recovered.
- PlayCanvas.tsx: revert <img key={imageUrl}> to key={imageUrl.slice(-48)}.
The gpt-image/mock paths emit multi-MB data URIs; using the full string as
React's reconciliation key adds avoidable diff overhead during the frequent
re-renders. Matches the existing <audio> element's key convention.
Validation: pnpm typecheck passes. (pnpm lint fails on a pre-existing Next 16
`next lint` CLI issue, identical on staging — unrelated to this change.)
De-duplicate the provider switch logic that was identical in chat.ts
and vision.ts into a shared model.ts module.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Eliminate the dual code path (raw fetch vs AI SDK) for text and vision.
All providers now go through createLanguageModel() + generateText(),
removing chatOpenAiCompatible/analyzeOpenAiCompatible, the manual Usage
type, summarizeUsage, and responseFormat plumbing from 8 call sites.
Key fix: @ai-sdk/openai v3 defaults to the Responses API (/responses);
DeepSeek only supports Chat Completions, so we use .chat() explicitly.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- TEXT/VISION: add native Anthropic & Google Gemini paths via Vercel AI SDK,
selectable through TEXT_PROVIDER / VISION_PROVIDER (default openai_compatible)
- IMAGE: expand to openai (gpt-image) / google (Nano Banana) via AI SDK
alongside the existing Runware task-array and OpenAI-compatible REST paths
- normalizeBaseUrl: tolerate URLs with/without /v1 (or /chat/completions);
append the per-protocol version segment only for bare hosts
- config: readProvider() reads *_PROVIDER; types: ProviderProtocol + provider?
- deps: @ai-sdk/anthropic, @ai-sdk/google; docs in .env.example + README
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
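One possible reading of the normalizeBaseUrl rule in the commit above, as a sketch only. The function name normalizeBaseUrlSketch, the default "/v1" segment, and the test for a bare host are assumptions; the real helper picks the version segment per protocol and may differ in edge cases.

```typescript
// Accepts a base URL with or without a version path, or a full
// .../chat/completions endpoint, and returns a base suitable for a client.
function normalizeBaseUrlSketch(raw: string, versionSegment: string = "/v1"): string {
  // Drop surrounding whitespace and trailing slashes.
  let url = raw.trim().replace(/\/+$/, "");
  // Tolerate a pasted full endpoint by stripping the operation path.
  url = url.replace(/\/chat\/completions$/, "");
  // A bare host has nothing after the authority part (scheme://host[:port]).
  const hasPath = /^[a-z][a-z0-9+.-]*:\/\/[^/]+\/.+/i.test(url);
  // Only bare hosts get the version segment appended; any existing path
  // (such as /v1 or /v1beta/openai) is kept as written.
  return hasPath ? url : url + versionSegment;
}
```

Leaving existing paths untouched is what keeps an endpoint with a non-standard version path, such as one ending in /v1beta/openai, from being rewritten.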
Add a `tag` option to chat() and have it print one `[cache] <tag>
hit=X miss=Y rate=Z%` line per call. Three Usage-shape variants are
probed in order so the same logger works across providers:
- DeepSeek (v3+): usage.prompt_cache_hit_tokens / *_miss_tokens
- OpenAI / o-series: usage.prompt_tokens_details.cached_tokens
- Anthropic: usage.cache_read_input_tokens / *_creation_*
When none of them are present (MiMo / local Ollama / others) we still
print prompt + completion totals so the cost baseline is visible.
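The probing order described above can be sketched like this. The field names are the ones listed in this message; the function name formatCacheLine, the UsageLike type, and the way miss is derived for the OpenAI and Anthropic shapes are assumptions of the sketch, not the logger's actual code.

```typescript
// Union of the usage fields probed; every field is optional because each
// provider reports only its own subset.
type UsageLike = {
  prompt_tokens?: number;
  completion_tokens?: number;
  prompt_cache_hit_tokens?: number;   // DeepSeek
  prompt_cache_miss_tokens?: number;  // DeepSeek
  prompt_tokens_details?: { cached_tokens?: number }; // OpenAI
  input_tokens?: number;                 // Anthropic (uncached input)
  cache_read_input_tokens?: number;      // Anthropic
  cache_creation_input_tokens?: number;  // Anthropic
};

function probeCache(usage: UsageLike): { hit: number; miss: number } | null {
  if (typeof usage.prompt_cache_hit_tokens === "number") {
    return { hit: usage.prompt_cache_hit_tokens, miss: usage.prompt_cache_miss_tokens ?? 0 };
  }
  const cached = usage.prompt_tokens_details?.cached_tokens;
  if (typeof cached === "number") {
    // Assumption: prompt_tokens includes the cached portion.
    return { hit: cached, miss: Math.max(0, (usage.prompt_tokens ?? 0) - cached) };
  }
  if (typeof usage.cache_read_input_tokens === "number") {
    // Assumption: uncached input plus cache writes count as misses.
    const miss = (usage.input_tokens ?? 0) + (usage.cache_creation_input_tokens ?? 0);
    return { hit: usage.cache_read_input_tokens, miss };
  }
  return null;
}

function formatCacheLine(tag: string, usage: UsageLike): string {
  const stats = probeCache(usage);
  if (stats === null) {
    // No cache fields at all: still report totals as a cost baseline.
    return `[cache] ${tag} prompt=${usage.prompt_tokens ?? 0} completion=${usage.completion_tokens ?? 0}`;
  }
  const total = stats.hit + stats.miss;
  const rate = total === 0 ? 0 : Math.round((stats.hit / total) * 100);
  return `[cache] ${tag} hit=${stats.hit} miss=${stats.miss} rate=${rate}%`;
}
```

Because each line starts with a fixed prefix and the tag, filtering the log for a single agent is a plain substring search such as "[cache] writer".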
Tag every callsite so the log is greppable:
architect / writer / character-designer / cinematographer / insert-beat
This is the prerequisite for the prefix-cache reordering work that
follows — without per-agent visibility there's no way to tell if a
prompt rearrangement actually moved the needle.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Flatten the pnpm monorepo (apps/web + packages/*) into a single web package at the repo root.
- Move app/lib/components/scripts/public to root; drop apps/web and packages/* wrappers
- Rewrite tsconfig paths (@infiplot/*) to ./lib/*; turbopack.root = __dirname
- Update Vercel (no root-directory) and Cloudflare (pnpm build:cf at root) deploy paths
- Regenerate pnpm-lock.yaml to drop stale workspace importers
- Bump engines.node to >=22 to match wrangler
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>