Commit Graph

13 Commits

Author SHA1 Message Date
Zonghao Yuan 0dea2f8e36 fix(ai-client): clean up regressions from OpenAI SDK migration and canvas frame fix (#74)
Three follow-ups to ef3b579 (OpenAI SDK migration) and ebe39ef (canvas frame):

- .env.example / config.ts / AGENTS.md: anthropic & google native protocols
  were removed with the Vercel AI SDK, but .env.example and AGENTS.md still
  advertised them. Rewrite the docs to point Claude/Gemini at their
  OpenAI-compatible endpoints (api.anthropic.com/v1,
  generativelanguage.googleapis.com/v1beta/openai), drop the dead Gemini
  "Nano Banana" image example, sync AGENTS.md (text/vision protocol list,
  image protocol list, the "OpenAI/Gemini via AI SDK" reference note), and
  append a short hint in readProvider() error message guiding
  anthropic/google users to openai_compatible instead of a bare rejection.

- chat.ts: drop the unsafe `as { prompt_tokens_details?: ... }` cast; read
  cached_tokens straight off the SDK's CompletionUsage type. Add a comment
  noting the OpenAI usage object reports cache reads only (no cache-write
  count), so the create cost the old AI SDK path logged is unrecoverable.

- PlayCanvas.tsx: revert <img key={imageUrl}> to key={imageUrl.slice(-48)}.
  The gpt-image/mock paths emit multi-MB data URIs; using the full string as
  React's reconciliation key adds avoidable diff overhead during the frequent
  re-renders. Matches the existing <audio> element's key convention.

Validation: pnpm typecheck passes. (pnpm lint fails on a pre-existing Next 16
`next lint` CLI issue, identical on staging — unrelated to this change.)
2026-06-14 13:36:19 +08:00
yuanzonghao e68e7e1690 feat(engine): add opt-in image timeout and scene-paint hedging
IMAGE_TIMEOUT_MS sets a per-attempt hard deadline (AbortSignal.timeout);
IMAGE_HEDGE_MS races a second identical scene-paint request when the
first is still pending past the threshold. Both default to OFF when
unset, preserving historical behavior for self-hosted deploys.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-13 11:21:47 +08:00
baizhi958216 ef3b57953b refactor(ai-client): replace AI SDK adapters with OpenAI SDK 2026-06-11 16:11:44 +08:00
yuanzonghao f4aca0b59c refactor(ai-client): extract shared createLanguageModel helper
De-duplicate the provider switch logic that was identical in chat.ts
and vision.ts into a shared model.ts module.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-07 11:55:55 +08:00
yuanzonghao 57bc6556ab refactor(ai-client): unify OpenAI-compatible path to AI SDK generateText
Eliminate the dual code path (raw fetch vs AI SDK) for text and vision.
All providers now go through createLanguageModel() + generateText(),
removing chatOpenAiCompatible/analyzeOpenAiCompatible, the manual Usage
type, summarizeUsage, and responseFormat plumbing from 8 call sites.

Key fix: @ai-sdk/openai v3 defaults to the Responses API (/responses);
DeepSeek only supports Chat Completions, so we use .chat() explicitly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-07 00:31:36 +08:00
Zonghao Yuan c30d11d60b fix(security): harden BYO API header against SSRF and input abuse (#33)
* fix(security): harden BYO API header against SSRF and input abuse

- Add lib/validateUrl.ts with HTTPS-only + public-IP enforcement,
  provider allowlist, IPv6 rejection, and userinfo-in-URL blocking.
- Add lib/byoHeaders.ts — single source of truth for client-side BYO
  header construction (deduplicates app/page.tsx & app/play/page.tsx).
- config.ts: validate BYO endpoints via isPublicUrl(), cap header at
  2 KB, truncate apiKey/model strings, sanitize log output.
- fetchWithRetry: default redirect to "manual" to block 302-to-intranet.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(security): address Copilot review — trim endpoint, strip control chars, drop unused import

- safeEndpoint: trim whitespace before URL validation
- safeString: strip ASCII control characters to prevent header injection
- play/page.tsx: remove unused BYO_STORAGE_KEY import

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-05 00:23:35 +08:00
yuanzonghao 9fc83de276 feat(web,engine): portrait-orientation scene images for mobile full-bleed
Thread orientation (portrait|landscape) from client through API, engine,
and image gen. Portrait devices render 1024x1792 (9:16) full-bleed scenes;
desktop/landscape keeps 1792x1024 (16:9). Adds cover-aware click→image
coordinate mapping, session-locked orientation, a shared coerceOrientation
helper, and a choices overflow cap in portrait.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 17:30:54 +08:00
yuanzonghao 865bf322e9 fix(ai-client): parse Runware host by hostname; doc nits
- inferImageProtocol: match runware.ai by parsed hostname (exact match or
  subdomain) instead of a bare substring, so notrunware.ai /
  runware.ai.evil.com no longer misroute to the Runware protocol
- README: document the image-2-vip → OpenAI-compatible exception; correct the
  Imagen wording (deprecated, EOL 2026-06-24 — not yet discontinued)

Addresses Copilot review on #30.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 17:09:05 +08:00
yuanzonghao 83fd5717e7 feat(ai-client): multi-provider compat — native Anthropic/Google + URL tolerance
- TEXT/VISION: add native Anthropic & Google Gemini paths via Vercel AI SDK,
  selectable through TEXT_PROVIDER / VISION_PROVIDER (default openai_compatible)
- IMAGE: expand to openai (gpt-image) / google (Nano Banana) via AI SDK
  alongside the existing Runware task-array and OpenAI-compatible REST paths
- normalizeBaseUrl: tolerate URLs with/without /v1 (or /chat/completions);
  append the per-protocol version segment only for bare hosts
- config: readProvider() reads *_PROVIDER; types: ProviderProtocol + provider?
- deps: @ai-sdk/anthropic, @ai-sdk/google; docs in .env.example + README

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 17:09:05 +08:00
DESKTOP-I1T6TF3\Q 347ab297d5 feat(web,engine): custom style — image upload, AI-extract prompt, painter ref
自定义画风入口里加上传按钮:客户端把图缩到 512px webp(base64),传到新
路由 /api/parse-style-image,vision LLM 解析成英文 style prompt 回填 textarea;
图本身随 sessionStorage → /api/start → Session.styleReferenceImage 透传,
painter.collectReferenceImages 把它置于 slot 0,整局每一幕都作为 reference
图锚定画风(brush / color / mood),比 priorScene 优先级更高。

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-03 19:15:19 +08:00
DESKTOP-I1T6TF3\Q 37c911f510 chore(engine): log prompt-cache hit/miss per chat call
Add a `tag` option to chat() and have it print one `[cache] <tag>
hit=X miss=Y rate=Z%` line per call. Three Usage-shape variants are
probed in order so the same logger works across providers:

  - DeepSeek (v3+):  usage.prompt_cache_hit_tokens / *_miss_tokens
  - OpenAI / o-series: usage.prompt_tokens_details.cached_tokens
  - Anthropic:        usage.cache_read_input_tokens / *_creation_*

When none of them are present (MiMo / local Ollama / others) we still
print prompt + completion totals so the cost baseline is visible.

Tag every callsite so the log is greppable:
  architect / writer / character-designer / cinematographer / insert-beat

This is the prerequisite for the prefix-cache reordering work that
follows — without per-agent visibility there's no way to tell if a
prompt rearrangement actually moved the needle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-03 10:42:33 +08:00
DESKTOP-I1T6TF3\Q bed4dc5a8f feat(web): gender-differentiated 4:5 covers + per-card styleGuide prebake
- Regenerate 60 covers (30 male + 30 female) via FLUX with story-specific
  prompts, replacing the prior gender-shared set
- Crop covers to 4:5 (960×1200) via sharp attention cover; matches new
  homepage card aspectRatio
- Persist all 60 prompts to public/home/prompts.json so the prebake step
  can reuse the cover's exact visual anchor (per-card styleGuide) and the
  first-act scene visually carries over from the poster the player clicked
- Restore /play?card= prebaked instant-play path on homepage card click
- Add OpenAI-compatible image route in ai-client for non-Runware endpoints
- Hide Next.js dev indicators globally; tweak F-key fullscreen label

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-03 02:26:35 +08:00
Zonghao Yuan dc5ecd60f6 refactor: flatten monorepo to single web package (#12)
Flatten the pnpm monorepo (apps/web + packages/*) into a single web package at the repo root.

- Move app/lib/components/scripts/public to root; drop apps/web and packages/* wrappers
- Rewrite tsconfig paths (@infiplot/*) to ./lib/*; turbopack.root = __dirname
- Update Vercel (no root-directory) and Cloudflare (pnpm build:cf at root) deploy paths
- Regenerate pnpm-lock.yaml to drop stale workspace importers
- Bump engines.node to >=22 to match wrangler

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-03 00:55:45 +08:00