Commit Graph

13 Commits

Author SHA1 Message Date
yuanzonghao ae3dd17e6b feat(web): add player name, freeform input, and unified settings modal
- Player name: stored in localStorage, injected into Architect/Writer/InsertBeat
  prompts so NPCs address the player by name, displayed in dialogue UI
- Freeform input: compact button at choice nodes expands to text input, LLM
  classifier routes to insert-beat (interactive NPC response) or change-scene
- SettingsModal: unified panel merging player name, voice toggle (with
  collapsible TTS key section), replacing the old TtsKeyModal
- Insert-beat upgrade: prompt now requires NPC reaction when characters are
  present, shared by both freeform and Vision paths
- IME guard: isComposing check on freeform input to prevent CJK mid-composition
  submission

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-07 12:37:50 +08:00
DESKTOP-I1T6TF3\Q b0b5630a25 feat(web): export interactive gallery + encrypted share file
Adds a "导出图集" action at the bottom-right of the play canvas that
snapshots the current session into localStorage and opens
/gallery#id=<id> in a new tab — the original play page keeps running
untouched. In parallel, sends the doc to /api/gallery-pack and
downloads the result as a binary .infiplot file the player can send
to a friend.

The snapshot pulls in:
  - Every visited scene's image + beat graph + recorded visit trail
  - All AI-prefetched alternate scenes (a new resolvedPrefetchesRef in
    PlayInner captures each prefetch as it resolves, so abandoned
    branches the engine already paid to generate are kept)
  - Character names + basePortraitUrl (voice base64 / styleReference
    are stripped — they aren't needed for replay)

/gallery is a no-network interactive replay:
  - Per-beat advance and per-choice navigation. Picked choices are
    highlighted; unpicked choices are clickable when an alternate was
    prefetched, greyed otherwise.
  - Stack-based navigation for stepping into branches with one-tap
    "返回主线" to collapse back to the main path.
  - Top-bar batch download for scene images (including unique
    AI-prefetched branch scenes, deduped against the main path) and
    character portraits. Fetched with a per-file AbortController + 20s
    timeout in a small concurrency pool, then clicked serially.
    Prevents one slow CDN response from stranding the busy button.
  - In-progress hint banner reminding the player to allow the
    browser's "multiple downloads" prompt.
  - F-key fullscreen with a top toolbar that auto-retracts after the
    initial glance and pops back down on cursor approach.
  - Per-scene dialogue panel (fa-clock-rotate-left, matching the
    in-game history affordance).
  - "导入分享文件" entry on the empty/error state — accepts a friend's
    .infiplot, posts to /api/gallery-unpack, renders the decrypted doc.

Share-file format (.infiplot):
  - AES-256-GCM via Web Crypto (portable to Cloudflare Workers).
  - Layout: 4-byte magic "IFPL" + 1-byte version + 12-byte nonce +
    ciphertext (includes 16-byte auth tag).
  - Key derived from GALLERY_SECRET via SHA-256.
  - GCM's auth tag gives tamper-detection for free; any flip in the
    ciphertext/nonce surfaces as "文件校验失败" — same error as wrong-key,
    so the distinction can't leak server config.
  - Stateless: server keeps no record of issued files.
  - GALLERY_SECRET unset → /api/gallery-pack returns 503, the play page
    silently skips the share-file download, local view still works.
    Rotating the secret invalidates every previously-issued file.

Retention: trimGalleryExports keeps only the 2 most recent localStorage
docs; older ones are evicted before each write so quota stays flat
regardless of how many times the player exports. Share files live on
the player's own disk — no retention concern.

Adds 'gallery_export' to the analytics event schema (scene_count only —
no free text).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-07 12:08:37 +08:00
yuanzonghao 57bc6556ab refactor(ai-client): unify OpenAI-compatible path to AI SDK generateText
Eliminate the dual code path (raw fetch vs AI SDK) for text and vision.
All providers now go through createLanguageModel() + generateText(),
removing chatOpenAiCompatible/analyzeOpenAiCompatible, the manual Usage
type, summarizeUsage, and responseFormat plumbing from 8 call sites.

Key fix: @ai-sdk/openai v3 defaults to the Responses API (/responses);
DeepSeek only supports Chat Completions, so we use .chat() explicitly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-07 00:31:36 +08:00
yuanzonghao aed05a0512 fix(web): remove hardcoded maxDuration so Vercel dashboard setting takes effect
Code-level `export const maxDuration = 60` and vercel.json `functions`
block were overriding the dashboard's 300s setting, causing ~100 504
timeouts per day on /api/scene and /api/start. Removing them lets each
Vercel plan use its own default (60s Hobby, 300s Pro) without breaking
self-deployers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-06 18:18:09 +08:00
yuanzonghao d646ce8db8 refactor(web): remove client-side BYO API key feature
The BYO (Bring Your Own) API key configuration for LLM and image
generation will be re-implemented via Cloudflare Workers. Remove
the client-side implementation to prepare for that migration.

TTS (text-to-speech) BYO key support is intentionally preserved.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-06 17:42:00 +08:00
yuanzonghao e88e988de3 fix(web): reduce FOT by stripping redundant voice data from transport
Three transport-only optimizations that cut per-session Vercel FOT by ~50-60%:

P0 — Server strips voice.referenceAudioBase64 from already-known characters
in /api/scene and /api/insert-beat responses (defense-in-depth).

P1 — Client strips all voice data from session before sending to
/api/scene, /api/vision, and /api/insert-beat. Voices are retained locally
and re-merged from responses via mergeCharactersPreserveVoice(). The engine
only needs character names + visualDescriptions for scene generation.

P3 — /api/beat-audio returns binary audio (Response with Content-Type)
instead of JSON-wrapped base64, saving ~33% encoding overhead. Client
converts to blob URLs; PlayCanvas accepts a single audioSrc prop.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-05 00:24:34 +08:00
yuanzonghao f6226facbd fix(web): address PR #28 review — explicit clientTts boolean + BYO key prefix hint
Harden the BYO-mode signal at the API boundary (start/scene/insert-beat):
only clientTts === true drops server TTS, so a stray truthy non-boolean can't
silently disable it. Add a non-blocking prefix hint in TtsKeyModal that warns
when the pasted key prefix (tp-/sk-) mismatches the selected key type — a
mismatch hits the wrong endpoint and plays silently, the symptom BYO fixes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 16:58:55 +08:00
yuanzonghao b0b2e922d3 feat(web): optional bring-your-own Xiaomi MiMo TTS key (browser-side synthesis)
Public users share one server TTS key, so Xiaomi's per-key RPM/TPM limits
cause silent playback under concurrency. This adds an OPTIONAL path: a user
can store their own Xiaomi MiMo key in the browser and synthesize voice
client-side against Xiaomi's CORS-open endpoints. The key lives only in
localStorage and is never sent to or logged by our server; the shared server
key still serves everyone who does not opt in.

- components/TtsKeyModal.tsx: shared key modal (key-family + region picker),
  reused by both the home and play pages
- app/play/page.tsx: silence nudge moved beside the mute toggle; modal opens
  in place instead of redirecting to the home page
- app/page.tsx: home page consumes the shared modal + readStoredTtsConfig
- lib/clientTtsConfig.ts, lib/ttsPresets.ts: browser config + region presets
- app/api/{start,scene,insert-beat}: thread per-request voice; lib/types update
- docs/xiaomi-tts-key.md + README note

Verified with tsc --noEmit (exit 0).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 16:58:55 +08:00
DESKTOP-I1T6TF3\Q 592c82816a Revert "feat(loading): support typewriter story teaser during first scene generation"
This reverts commit 4e4e06ec8a.
2026-06-04 15:13:03 +08:00
DESKTOP-I1T6TF3\Q 4e4e06ec8a feat(loading): support typewriter story teaser during first scene generation 2026-06-04 14:40:35 +08:00
DESKTOP-I1T6TF3\Q e04c51e875 feat(api): support custom BYO API header override on client fetches and backend config 2026-06-04 13:49:46 +08:00
DESKTOP-I1T6TF3\Q 347ab297d5 feat(web,engine): custom style — image upload, AI-extract prompt, painter ref
自定义画风入口里加上传按钮:客户端把图缩到 512px webp(base64),传到新
路由 /api/parse-style-image,vision LLM 解析成英文 style prompt 回填 textarea;
图本身随 sessionStorage → /api/start → Session.styleReferenceImage 透传,
painter.collectReferenceImages 把它置于 slot 0,整局每一幕都作为 reference
图锚定画风(brush / color / mood),比 priorScene 优先级更高。

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-03 19:15:19 +08:00
Zonghao Yuan dc5ecd60f6 refactor: flatten monorepo to single web package (#12)
Flatten the pnpm monorepo (apps/web + packages/*) into a single web package at the repo root.

- Move app/lib/components/scripts/public to root; drop apps/web and packages/* wrappers
- Rewrite tsconfig paths (@infiplot/*) to ./lib/*; turbopack.root = __dirname
- Update Vercel (no root-directory) and Cloudflare (pnpm build:cf at root) deploy paths
- Regenerate pnpm-lock.yaml to drop stale workspace importers
- Bump engines.node to >=22 to match wrangler

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-03 00:55:45 +08:00