infiplot-web

Author	SHA1	Message	Date
DESKTOP-I1T6TF3\Q	2d35c1d9de	feat(i18n): add language switcher with en/ja translations - New client-side i18n via React Context (useI18n, tArray, I18nProvider) - Catalog ships 21 locale stubs; only zh-CN/en/ja have reviewed translations - Header language switcher (globe icon + short label) before settings gear - All hardcoded Chinese UI text migrated to keys: typewriter, options, hints (with embedded gear icon via dangerouslySetInnerHTML), settings panel, footer/about, play page hints - AI output language follows user-selected locale via trailing one-liner directive appended to Architect/Writer/CharacterDesigner/InsertBeat user messages (preserves system-prompt cacheability) - Per-locale separator rule: zh uses middot between every glyph; en/ja use plain spaces - Option value → i18n key suffix maps preserve Chinese as the underlying identifier so analytics unions and STYLE_MAP keys stay byte-stable Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-18 16:54:35 +08:00
DESKTOP-I1T6TF3\Q	f1fe7964a2	feat(options): add third gender option "X" for universal gender - Add "X" to GENDERS array in lib/options.ts - Add example phrases for "X" gender (sci-fi themed) - Make "X" use same preset cards as male gender - Map "X" to "通用性别" when transmitting to AI - Add "X" to DISPLAY_ORDER (same as male) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-18 09:18:50 +08:00
yuanzonghao	d0faa06cc1	perf(image): switch Runware output format from PNG to WEBP WEBP produces ~90% smaller files than PNG at visually identical quality (tested: 5.4MB → ~550KB per 1792×1024 image), significantly reducing client download time for users on slower connections. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-15 18:04:31 +08:00
yuanzonghao	ba9f9c1342	Merge PR #79: feat(tts): StepFun voice selection via CharacterDesigner + provider-aware beat-audio - StepFun voice selection: CharacterDesigner picks a preset voiceId from the 32-entry catalog (zero extra LLM call); pickStepfunVoiceId remains as fallback. - Prebaked homepage cards enriched with stepfunVoiceId (147 characters, gemini model). - /api/tts-provider endpoint + client probe: skip the ~220KB Xiaomi reference audio when the server runs StepFun (saves Fast Origin Transfer bandwidth). - Server-side resolveVoice normalization: re-provisions on provider mismatch. - Removed hardcoded 1.2x speech playback speed (was for slow MiMo voice). - Hardened voice-provider validation per PR-agent review. Xiaomi path prompt is byte-identical to history (prompt-cache-preserving).	2026-06-15 15:08:21 +08:00
yuanzonghao	65b7daff0b	fix(beat-audio): harden voice-provider validation and resolveVoice fast path Address PR-agent review findings: - resolveVoice fast path: replace ambiguous boolean comparison (voiceProvider === "stepfun") === serverStepfun with explicit per-provider equality checks. Prevents an undefined or unknown provider from matching the non-stepfun (xiaomi) branch by accident. - /api/beat-audio route: reject requests whose voice.provider is present but not in the VALID_TTS_PROVIDERS whitelist (e.g. "azure"). Previously such a request would pass validation when fallback fields were also present, and resolveVoice might use the invalid voice directly instead of falling back to reprovision — producing a silent beat instead of a voiced one.	2026-06-15 14:33:46 +08:00
yuanzonghao	3a012d46bf	fix(auth): harden snapshot paths per PR agent review Address two suggestions from the PR agent review: 1. lib/authResume.ts — catch isAuthed() exceptions in consumeResumeSnapshot. The network/timeout path now returns null (snapshot already removed earlier to prevent the play-page bootstrap's retryBootstrap loop from re-entering this path). Document the intentional removeItem-before-isAuthed ordering. 2. components/AuthModal.tsx — wrap onBeforeOAuth in try-catch so a snapshot failure (e.g. sessionStorage blocked in privacy mode) does not abort the OAuth flow and leave the UI stuck in loading.	2026-06-15 14:32:04 +08:00
yuanzonghao	8cdeb1592f	refactor(auth): share OAuth-resume plumbing between home and play pages Extract the page-agnostic resume primitives into lib/authResume.ts: - isAuthed() — single login check (was duplicated in app/page.tsx) - writeResumeSnapshot(key, primary, fallbacks) — quota-safe sessionStorage write with ordered lighter-payload fallbacks (was hand-rolled try/catch in both pages) - consumeResumeSnapshot<T>(key) — consume-once resume gate that verifies the user is signed in before returning the snapshot, else clears it Both pages now share this plumbing while keeping their own snapshot shapes and restore side effects (home: form fields + start(); play: Session + restorePlayResume + deferred action replay). Unify the persist trigger: home previously snapshotted eagerly inside start() before opening the modal, while play snapshotted in AuthModal.onBeforeOAuth at redirect time. Move home to the same onBeforeOAuth trigger so both pages persist at the single OAuth-redirect instant — the eager-snapshot special case is gone, and OTP (no redirect) keeps its in-place onSuccess resume on both pages. Net: -21 lines. Behavior preserved for OTP; OAuth resume now consistent.	2026-06-15 14:03:14 +08:00
yuanzonghao	375f401c8f	fix(tts): persist stepfunVoiceId on Character + harden probe race Two follow-ups from pr-agent review of #79: 1. director.ts voicePromises built a Character WITHOUT stepfunVoiceId, so on a StepFun server the client (which omits the voice payload to save FOT) echoed back only voiceDescription — and the server re-scored via pickStepfunVoiceId every beat instead of honoring the LLM pick. The whole "CharacterDesigner picks a preset id" mechanism was effectively bypassed on live StepFun sessions (it only worked for prebaked cards, which carry stepfunVoiceId in their JSON). Persist stepfunVoiceId onto the Character so the client→server round-trip keeps the LLM selection. 2. fetchBeatAudio's null-provider branch (probe pending) required speaker.voice and silently dropped a stepfun-only speaker. Accept any synthesizable source (voice | stepfunVoiceId | voiceDescription) so a slow getTtsProvider probe can't drop audio during the first scene's fetch window. The server resolveVoice normalizes regardless of which fields arrive.	2026-06-15 13:05:36 +08:00
yuanzonghao	ca73a41a0b	feat(tts): StepFun voice selection via CharacterDesigner + provider-aware beat-audio Make homepage cards and live sessions produce sound when the server is configured for StepFun TTS, instead of silently failing (the prebaked Xiaomi voice was useless on a StepFun server, and wasted ~220KB/beat in Fast Origin Transfer). Three coordinated changes: 1. CharacterDesigner now picks a StepFun preset voice id directly from the 32-entry catalog in the SAME LLM call that designs the character — zero extra latency, LLM-grade match quality. The Xiaomi prompt path is byte-identical to history (verified programmatically) so cache hit rate and voice quality are preserved. pickStepfunVoiceId (keyword scorer) remains the fallback for orphan speakers / invalid LLM picks. 2. The 32-preset catalog moves to lib/tts-client/stepfun-voices.json as the single source of truth, shared by the scorer, the CharacterDesigner prompt, /api/tts-provider, and the offline enrich script. 3. A new GET /api/tts-provider endpoint lets the client probe the server's TTS provider at /play mount. fetchBeatAudio then shapes its request body: on a StepFun server it sends the lightweight stepfunVoiceId / voiceDescription and omits the ~220KB Xiaomi reference audio (FOT saving ~13MB per protagonist per session on prebaked cards). requestBeatAudio re-provisions on a provider mismatch before synth, so audio never goes silent on a cross-provider replay or mid-session provider flip. New type fields are all optional and backward-compatible: Character.stepfunVoiceId, BeatAudioRequest.voiceDescription/characterName/stepfunVoiceId, voice made optional. AGENTS.md updated for the new route, type fields, dependency map, and StepFun voice-selection flow.	2026-06-15 12:49:25 +08:00
Zonghao Yuan	0dea2f8e36	fix(ai-client): clean up regressions from OpenAI SDK migration and canvas frame fix (#74) Three follow-ups to `ef3b579` (OpenAI SDK migration) and `ebe39ef` (canvas frame): - .env.example / config.ts / AGENTS.md: anthropic & google native protocols were removed with the Vercel AI SDK, but .env.example and AGENTS.md still advertised them. Rewrite the docs to point Claude/Gemini at their OpenAI-compatible endpoints (api.anthropic.com/v1, generativelanguage.googleapis.com/v1beta/openai), drop the dead Gemini "Nano Banana" image example, sync AGENTS.md (text/vision protocol list, image protocol list, the "OpenAI/Gemini via AI SDK" reference note), and append a short hint in readProvider() error message guiding anthropic/google users to openai_compatible instead of a bare rejection. - chat.ts: drop the unsafe `as { prompt_tokens_details?: ... }` cast; read cached_tokens straight off the SDK's CompletionUsage type. Add a comment noting the OpenAI usage object reports cache reads only (no cache-write count), so the create cost the old AI SDK path logged is unrecoverable. - PlayCanvas.tsx: revert <img key={imageUrl}> to key={imageUrl.slice(-48)}. The gpt-image/mock paths emit multi-MB data URIs; using the full string as React's reconciliation key adds avoidable diff overhead during the frequent re-renders. Matches the existing <audio> element's key convention. Validation: pnpm typecheck passes. (pnpm lint fails on a pre-existing Next 16 `next lint` CLI issue, identical on staging — unrelated to this change.)	2026-06-14 13:36:19 +08:00
yuanzonghao	2f6e67bd80	fix(play): restore server TTS, FOT strip/merge, nudge, and blob cleanup Reverts the regressions from `b63b694` on the server-fallback path: P0 — fetchBeatAudio non-BYO branch was a bare return; every non-BYO user got silent playback regardless of server TTS config. Re-connect to /api/beat-audio with the beatAudioAbortRef signal, count 204/!ok as silence strikes, create a blob URL on success. P1 — stripVoicesForTransport + mergeCharactersPreserveVoice were deleted, so the server-fallback path re-sent ~160KB referenceAudioBase64 per character on every request AND lost voices for already-known characters after scene 1. Re-add both, applied ONLY on the server-fallback branches in engineClient.ts (BYO client-direct path untouched). P3 — the aborted-before-store blob URL race had no revoke, leaking one blob URL per cancelled synth. Re-add the else-if revoke. P2 — handleSettingsSaved ignored ttsConfigured, so a BYO key entered mid-session only took effect after a page reload. Re-add the ref/state refresh + audio re-prefetch. Also restore the silence-nudge UI (silenceStrikes counter, SILENCE_NUDGE_THRESHOLD, dismissible pill beside the mute toggle) that surfaces BYO-key guidance when the shared server key is being rate-limited. Verified live: /api/beat-audio now returns 200 (was 0 calls under the bug); audio plays after synth completes.	2026-06-14 13:09:09 +08:00
yuanzonghao	cb830f023d	Merge origin/staging into feat/supabase-auth Resolve conflicts: keep login_success alongside the new play_error / play_visibility_lost analytics events; fold auth retry into the play-page catch blocks so 401s open the login modal and are NOT tracked as play_error. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-13 23:44:23 +08:00
yuanzonghao	89a5c54065	fix(auth): address PR review and OAuth state-loss bugs - proxy: await getUser() so refreshed session cookies land on the response - callback: gate on AUTH_ENABLED, reject non-relative next (open redirect) - page: snapshot + resume form and style image across the OAuth redirect; require login before the style-image vision parse - play: wire authResolveRef so login retries the action that hit 401; dismissing the modal no longer re-fires it - server: wrap cookie setAll in try/catch for read-only contexts Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-13 19:27:51 +08:00
yuanzonghao	0998f7c46a	feat(play): add error observability analytics for mobile diagnostics Track play_error and play_visibility_lost events via Umami to distinguish mobile vs desktop failure modes. Each error event captures orientation, connection type, visibility state, elapsed time bucket, and error classification — all categorical, no free text. Includes postJson "HTTP \d+" status parsing for the new engineClient dual-path architecture. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-13 18:57:38 +08:00
yuanzonghao	87a2f93edb	feat(auth): add Supabase auth with Google, GitHub, and email OTP login Introduce user registration/login gated behind optional NEXT_PUBLIC_SUPABASE_* env vars (leave blank to disable — app behaves exactly as before). Adds proxy.ts for automatic cookie session refresh, requireUser() API route guards on all 7 compute-consuming routes, AuthModal (Google/GitHub OAuth + 6-digit email OTP), UserChip header component, and login_success analytics event. Identity is fully decoupled from Session/engine — no type changes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-13 17:33:55 +08:00
yuanzonghao	e68e7e1690	feat(engine): add opt-in image timeout and scene-paint hedging IMAGE_TIMEOUT_MS sets a per-attempt hard deadline (AbortSignal.timeout); IMAGE_HEDGE_MS races a second identical scene-paint request when the first is still pending past the threshold. Both default to OFF when unset, preserving historical behavior for self-hosted deploys. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-13 11:21:47 +08:00
baizhi958216	c4ffc16498	Merge pull request #64 from zonghaoyuan/refactor/settings-modal feat: add client-side model configuration and server fallback	2026-06-12 22:09:43 +08:00
baizhi958216	5608b0fdd0	fix(engine): tolerate duplicated JSON outputs	2026-06-11 16:11:52 +08:00
baizhi958216	ef3b57953b	refactor(ai-client): replace AI SDK adapters with OpenAI SDK	2026-06-11 16:11:44 +08:00
baizhi958216	6cd7d88326	feat(web): fallback to server API routes when no client-side model config is set When a user has not configured their own model keys in localStorage, engine calls now automatically route through /api/* server routes instead of throwing "模型配置未设置". This lets Vercel deploys with server-side environment variables work out of the box. - Add lib/engineClient.ts as a unified client-side routing layer: checks localStorage for BYO config, falls back to POST /api/start, /api/scene, /api/vision, /api/classify-freeform, /api/insert-beat - Update app/play/page.tsx to use engineClient instead of direct engine imports; remove buildEngineConfig() - Update app/page.tsx style-image parsing to also fall back to /api/parse-style-image when no local model config exists Signed-off-by: zhi <zhi@peropero.net>	2026-06-11 12:15:14 +08:00
baizhi958216	94973bc6c6	fix(tts): add non-null assertion in stepfun array access Signed-off-by: baizhi958216 <1475289190@qq.com>	2026-06-11 12:15:14 +08:00
baizhi958216	759319bf28	feat(config): extract STYLE_EXTRACTION_PROMPT to shared lib for client reuse Signed-off-by: baizhi958216 <1475289190@qq.com>	2026-06-11 12:15:13 +08:00
baizhi958216	a2dd5ad630	feat(config): add client-side model config storage and EngineConfig resolver Signed-off-by: baizhi958216 <1475289190@qq.com>	2026-06-11 12:15:13 +08:00
baizhi958216	2088bae311	fix(tts): replace Buffer.from with browser-compatible arrayBufferToBase64 in stepfun Signed-off-by: baizhi958216 <1475289190@qq.com>	2026-06-11 12:15:13 +08:00
DESKTOP-I1T6TF3\Q	621f83c47b	feat(web): embed beat audio into gallery and infiplot exports Walk every speaking beat at export time, reuse current scene's beatAudioMap, and synth the rest via BYO TTS or /api/beat-audio with concurrency 4. Show a progress toast on the play page while collecting. Gallery export keeps audio in a sidecar localStorage key so the first paint is not blocked by JSON.parse-ing several MB of base64; the gallery lazy-loads it after the first scene image, then plays per-beat audio with a mute toggle persisted to localStorage. .infiplot share files embed audioByBeatId in the doc itself (v2); on import the data URIs survive scene swaps and feed back into the per-beat audio map so replayers hear the original voices for free. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-11 09:29:16 +08:00
Zonghao Yuan	d15d53ba65	Merge pull request #57 from zonghaoyuan/feat/tts-stepfun-provider feat(tts): add StepFun preset-voice provider, route by URL + voice tag	2026-06-09 14:28:36 +08:00
yuanzonghao	1a6238f8b8	fix(tts): harden StepFun provider integration - Validate voice.provider against known whitelist (xiaomi|stepfun) in beat-audio route to return a clear 400 instead of falling through - Move single-char pronouns (他/她) to weak-signal fallback in detectGender to avoid false positives on compounds like 其他 - Update .env.example with StepFun configuration examples Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-09 14:24:27 +08:00
DESKTOP-I1T6TF3\Q	04f22249c9	fix(tts): make stepfun preset pick case-stable and per-character - Hash the lowercased description (matching the case-insensitive scoring) so the same archetype text picks the same preset regardless of case. - Thread the character name through provisionVoice -> stepfunProvision as the hash salt, so two characters that share archetype keywords spread across the top-N candidate presets instead of collapsing on one voice. Xiaomi path is unaffected (voicedesign mints a unique clip per call).	2026-06-09 09:14:44 +08:00
DESKTOP-I1T6TF3\Q	19bbee16fe	feat(tts): add StepFun preset-voice provider, route by URL + voice tag Add StepFun step-tts-mini / step-tts-2 / stepaudio-2.5-tts as an alternate TTS provider alongside Xiaomi MiMo. Auto-detected from TTS_BASE_URL host (contains `stepfun.com` → StepFun; otherwise → MiMo), mirroring how the image client infers Runware from `*.runware.ai`. CharacterVoice becomes a discriminated union on `provider`: - xiaomi: { referenceAudioBase64, mimeType } — unchanged - stepfun: { voiceId, model, mimeType } — preset voice ID + chosen model Provision dispatches on the current cfg's base URL; synthesis dispatches on the voice's own `provider` tag so a session with mixed voices (e.g. a provider switch mid-development) routes each beat through the correct protocol. xiaomiSynthesize now guards against being called with a non-xiaomi voice, surfacing the bug as a clear runtime error instead of a TypeScript narrow violation at the access site. StepFun has no voicedesign equivalent — only preset voices + voice cloning from a reference audio upload. Cloning would require an extra asset per character, so v1 maps the LLM's Chinese voiceDescription to one of the 32 published preset IDs via gender + age + tone keyword scoring, with a deterministic hash spread across the top-3 candidates so multiple characters with similar descriptions don't collapse onto the identical preset. lineDelivery is accepted but not yet propagated to StepFun's voice_label.emotion / .style fields — left as a follow-up. beat-audio route validation relaxed from `voice.referenceAudioBase64` (xiaomi-shaped) to `voice.provider` (shape-agnostic), so stepfun voices pass the gate; provider-specific shape errors still surface from the synth function. Observed latency on InfiPlot's dev loop: StepFun step-tts-mini median ~2.3s per beat with 0% timeouts across the test session, vs MiMo's median ~8s with the long tail tripping the existing 15s synth budget on roughly 2 of 3 beats. Pricing: step-tts-mini ¥0.9 per 10,000 characters (~¥0.14 per typical 50-beat session) vs MiMo TTS currently free under the Token Plan creator incentive. AGENTS.md provider matrix updated to describe both providers and the discriminated-union dispatch. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-08 17:15:02 +08:00
Qi Chen	fc62c9edf5	feat(engine): tighten CharacterDesigner prompt to prevent look-alike … (#56) * feat(engine): tighten CharacterDesigner prompt to prevent look-alike characters Expand the visualDescription rules into a 6-element mandatory checklist (hair quad / eyes triad / face & build / outfit quad / personality-driven vibe / silhouette tag) and add an explicit anti-collision rule comparing against the existing cast across cross-color-family and cross-silhouette dimensions. Also upgrade the user-message "已设定角色" block from soft hint to hard constraint with an explicit pre-write scan step, nudging the LLM into chain-of-thought differentiation before emitting tags. All additions land in the session-stable system prefix, so prompt cache absorbs the extra tokens — per-call billed token delta is ~0. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(engine): replace pose examples with aura descriptors in personality vibe The PERSONALITY-DRIVEN VIBE element listed concrete poses (arms crossed, chin tilted up, slight slouch) which contradicted the earlier rule banning transient poses from visualDescription. Switch to pure atmosphere/aura keywords so the character card stays pose-neutral. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com> Co-authored-by: yuanzonghao <yuanzonghao123@gmail.com>	2026-06-08 16:27:15 +08:00
yuanzonghao	75548ce005	Merge pull request #52 from zonghaoyuan/feat/story-share feat(play): add encrypted story sharing with replay	2026-06-08 09:57:16 +08:00
yuanzonghao	39a7269494	fix(share): harden story share and relocate import button - Add Content-Length pre-check to story-pack and story-unpack routes to reject oversized payloads before buffering the body - Suppress internal error details in story-unpack catch (was leaking e.message to the client) - Strengthen sceneIndex validation: require non-negative integer - Guard against undefined storyState when replaying shared stories - Fix prefetch regression: remove currentBeat?.id from useEffect deps that was re-triggering all change-scene prefetches on every beat - Fix double detach: use else-if so the second replay detach guard doesn't fire redundantly after the first already detached - Align client file-size limit by format (.json 12MB, .infiplot 13MB) - Move "载入剧情" import button next to "开始" with hover tooltip Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-08 08:46:05 +08:00
yuanzonghao	867c52c24f	fix(gallery): address review findings in zip download module - Handle downloadImagesAsZip return value and surface errors to user - Fix inferImageExtension garbage output for data URIs without semicolons - Scale blob URL revocation delay for large zip files (>5MB → 60s) - Cap uniqueZipPath dedup loop at 10k iterations with timestamp fallback - Support relative URLs in inferImageExtension via base URL - Handle svg+xml MIME subtype correctly Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-07 22:32:23 +08:00
baizhi958216	0abd5f1525	feat(play): add encrypted story sharing	2026-06-07 17:13:27 +08:00
baizhi958216	7925e9c459	feat(gallery): download scene gallery as zip Signed-off-by: baizhi958216 <1475289190@qq.com>	2026-06-07 15:45:46 +08:00
yuanzonghao	4972243a93	fix: address PR Agent review findings across 6 files Restrict PR Agent workflow to trusted collaborators on PR comments only, fix UTF-8 byte counting in gallery-pack, correct portrait-to-landscape fallback orientation, track inserted freeform beats in visitedBeatIds, allow clearing stored TTS key, and guard empty-string fuzzy match in style selector. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-07 14:40:37 +08:00
yuanzonghao	53868471c6	feat(web): add 14 new art styles with thumbnails and reorder style grid Add 14 new painting styles sourced from preset story card generation scripts: Dunhuang fresco, Persian miniature, Byzantine mosaic, stained glass, vaporwave, vector illustration, low poly, pop art, glitch art, papercut, steampunk, xianxia fantasy, dark fairytale, and urban fantasy. Reorder all 36 styles into logical visual categories (anime → cinematic → Eastern traditional → Western traditional → genre → digital → handcraft) for easier browsing. Update "auto" thumbnail to a 3×3 composite grid and "custom" thumbnail to a paintbrush-on-canvas concept image. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-07 12:56:54 +08:00
yuanzonghao	ae3dd17e6b	feat(web): add player name, freeform input, and unified settings modal - Player name: stored in localStorage, injected into Architect/Writer/InsertBeat prompts so NPCs address the player by name, displayed in dialogue UI - Freeform input: compact button at choice nodes expands to text input, LLM classifier routes to insert-beat (interactive NPC response) or change-scene - SettingsModal: unified panel merging player name, voice toggle (with collapsible TTS key section), replacing the old TtsKeyModal - Insert-beat upgrade: prompt now requires NPC reaction when characters are present, shared by both freeform and Vision paths - IME guard: isComposing check on freeform input to prevent CJK mid-composition submission Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-07 12:37:50 +08:00
DESKTOP-I1T6TF3\Q	b0b5630a25	feat(web): export interactive gallery + encrypted share file Adds a "导出图集" action at the bottom-right of the play canvas that snapshots the current session into localStorage and opens /gallery#id=<id> in a new tab — the original play page keeps running untouched. In parallel, sends the doc to /api/gallery-pack and downloads the result as a binary .infiplot file the player can send to a friend. The snapshot pulls in: - Every visited scene's image + beat graph + recorded visit trail - All AI-prefetched alternate scenes (a new resolvedPrefetchesRef in PlayInner captures each prefetch as it resolves, so abandoned branches the engine already paid to generate are kept) - Character names + basePortraitUrl (voice base64 / styleReference are stripped — they aren't needed for replay) /gallery is a no-network interactive replay: - Per-beat advance and per-choice navigation. Picked choices are highlighted; unpicked choices are clickable when an alternate was prefetched, greyed otherwise. - Stack-based navigation for stepping into branches with one-tap "返回主线" to collapse back to the main path. - Top-bar batch download for scene images (including unique AI-prefetched branch scenes, deduped against the main path) and character portraits. Fetched with a per-file AbortController + 20s timeout in a small concurrency pool, then clicked serially. Prevents one slow CDN response from stranding the busy button. - In-progress hint banner reminding the player to allow the browser's "multiple downloads" prompt. - F-key fullscreen with a top toolbar that auto-retracts after the initial glance and pops back down on cursor approach. - Per-scene dialogue panel (fa-clock-rotate-left, matching the in-game history affordance). - "导入分享文件" entry on the empty/error state — accepts a friend's .infiplot, posts to /api/gallery-unpack, renders the decrypted doc. Share-file format (.infiplot): - AES-256-GCM via Web Crypto (portable to Cloudflare Workers). - Layout: 4-byte magic "IFPL" + 1-byte version + 12-byte nonce + ciphertext (includes 16-byte auth tag). - Key derived from GALLERY_SECRET via SHA-256. - GCM's auth tag gives tamper-detection for free; any flip in the ciphertext/nonce surfaces as "文件校验失败" — same error as wrong-key, so the distinction can't leak server config. - Stateless: server keeps no record of issued files. - GALLERY_SECRET unset → /api/gallery-pack returns 503, the play page silently skips the share-file download, local view still works. Rotating the secret invalidates every previously-issued file. Retention: trimGalleryExports keeps only the 2 most recent localStorage docs; older ones are evicted before each write so quota stays flat regardless of how many times the player exports. Share files live on the player's own disk — no retention concern. Adds 'gallery_export' to the analytics event schema (scene_count only — no free text). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-07 12:08:37 +08:00
yuanzonghao	f4aca0b59c	refactor(ai-client): extract shared createLanguageModel helper De-duplicate the provider switch logic that was identical in chat.ts and vision.ts into a shared model.ts module. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-07 11:55:55 +08:00
yuanzonghao	57bc6556ab	refactor(ai-client): unify OpenAI-compatible path to AI SDK generateText Eliminate the dual code path (raw fetch vs AI SDK) for text and vision. All providers now go through createLanguageModel() + generateText(), removing chatOpenAiCompatible/analyzeOpenAiCompatible, the manual Usage type, summarizeUsage, and responseFormat plumbing from 8 call sites. Key fix: @ai-sdk/openai v3 defaults to the Responses API (/responses); DeepSeek only supports Chat Completions, so we use .chat() explicitly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-07 00:31:36 +08:00
yuanzonghao	165dcbc5e6	fix(engine): prevent Architect from seeing literal "auto" styleGuide Replace session.styleGuide with a descriptive placeholder before the Architect runs, so its prompt reads a natural sentence instead of the raw "auto" marker. Also wrap selectStyle in a try-catch so a transient LLM failure falls back to 吉卜力 instead of crashing session start. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-06 22:28:44 +08:00
yuanzonghao	585f302908	feat(engine): auto-select art style via parallel LLM call When user picks "自动", the client sends styleGuide="auto" to the server. The orchestrator then runs a lightweight style-selector LLM call in parallel with the Architect — both only depend on worldSetting, so there is zero added latency. The selector picks the best-matching preset from STYLE_MAP based on genre, mood, and setting. Also moves STYLE_MAP from page.tsx to lib/options.ts so it can be shared between client and server. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-06 22:08:08 +08:00
yuanzonghao	31ce3f1d40	feat(web): revamp style modal UI with grid cards, thumbnails, and dual-view Redesign the painting-style picker inspired by Pollo AI: widen modal to 1400px, show styles as square thumbnail cards in a 4-column grid with name labels below, add ember glow hover effect, and split custom-style editing into its own view. Simplify style names (e.g. "京阿尼细腻日常" → "京阿尼"), add 22 .webp preview thumbnails, and remove the per-preset override mechanism in favor of a cleaner grid + custom flow. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-06 20:45:08 +08:00
yuanzonghao	d646ce8db8	refactor(web): remove client-side BYO API key feature The BYO (Bring Your Own) API key configuration for LLM and image generation will be re-implemented via Cloudflare Workers. Remove the client-side implementation to prepare for that migration. TTS (text-to-speech) BYO key support is intentionally preserved. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-06 17:42:00 +08:00
Zonghao Yuan	c30d11d60b	fix(security): harden BYO API header against SSRF and input abuse (#33) * fix(security): harden BYO API header against SSRF and input abuse - Add lib/validateUrl.ts with HTTPS-only + public-IP enforcement, provider allowlist, IPv6 rejection, and userinfo-in-URL blocking. - Add lib/byoHeaders.ts — single source of truth for client-side BYO header construction (deduplicates app/page.tsx & app/play/page.tsx). - config.ts: validate BYO endpoints via isPublicUrl(), cap header at 2 KB, truncate apiKey/model strings, sanitize log output. - fetchWithRetry: default redirect to "manual" to block 302-to-intranet. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(security): address Copilot review — trim endpoint, strip control chars, drop unused import - safeEndpoint: trim whitespace before URL validation - safeString: strip ASCII control characters to prevent header injection - play/page.tsx: remove unused BYO_STORAGE_KEY import Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-06-05 00:23:35 +08:00
yuanzonghao	9fc83de276	feat(web,engine): portrait-orientation scene images for mobile full-bleed Thread orientation (portrait|landscape) from client through API, engine, and image gen. Portrait devices render 1024x1792 (9:16) full-bleed scenes; desktop/landscape keeps 1792x1024 (16:9). Adds cover-aware click→image coordinate mapping, session-locked orientation, a shared coerceOrientation helper, and a choices overflow cap in portrait. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-04 17:30:54 +08:00
yuanzonghao	865bf322e9	fix(ai-client): parse Runware host by hostname; doc nits - inferImageProtocol: match runware.ai by parsed hostname (exact match or subdomain) instead of a bare substring, so notrunware.ai / runware.ai.evil.com no longer misroute to the Runware protocol - README: document the image-2-vip → OpenAI-compatible exception; correct the Imagen wording (deprecated, EOL 2026-06-24 — not yet discontinued) Addresses Copilot review on #30. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-04 17:09:05 +08:00
yuanzonghao	83fd5717e7	feat(ai-client): multi-provider compat — native Anthropic/Google + URL tolerance - TEXT/VISION: add native Anthropic & Google Gemini paths via Vercel AI SDK, selectable through TEXT_PROVIDER / VISION_PROVIDER (default openai_compatible) - IMAGE: expand to openai (gpt-image) / google (Nano Banana) via AI SDK alongside the existing Runware task-array and OpenAI-compatible REST paths - normalizeBaseUrl: tolerate URLs with/without /v1 (or /chat/completions); append the per-protocol version segment only for bare hosts - config: readProvider() reads *_PROVIDER; types: ProviderProtocol + provider? - deps: @ai-sdk/anthropic, @ai-sdk/google; docs in .env.example + README Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-04 17:09:05 +08:00
yuanzonghao	b0b2e922d3	feat(web): optional bring-your-own Xiaomi MiMo TTS key (browser-side synthesis) Public users share one server TTS key, so Xiaomi's per-key RPM/TPM limits cause silent playback under concurrency. This adds an OPTIONAL path: a user can store their own Xiaomi MiMo key in the browser and synthesize voice client-side against Xiaomi's CORS-open endpoints. The key lives only in localStorage and is never sent to or logged by our server; the shared server key still serves everyone who does not opt in. - components/TtsKeyModal.tsx: shared key modal (key-family + region picker), reused by both the home and play pages - app/play/page.tsx: silence nudge moved beside the mute toggle; modal opens in place instead of redirecting to the home page - app/page.tsx: home page consumes the shared modal + readStoredTtsConfig - lib/clientTtsConfig.ts, lib/ttsPresets.ts: browser config + region presets - app/api/{start,scene,insert-beat}: thread per-request voice; lib/types update - docs/xiaomi-tts-key.md + README note Verified with tsc --noEmit (exit 0). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-04 16:58:55 +08:00
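
The .infiplot layout described in commit b0b5630a25 (4-byte magic "IFPL", 1-byte version, 12-byte nonce, then ciphertext that includes the 16-byte GCM auth tag) could be framed roughly as follows. This is a minimal sketch, not the repository's code: `packEnvelope`, `parseEnvelope`, and `InfiplotEnvelope` are hypothetical names, and the AES-256-GCM encryption and key derivation steps are deliberately left out.

```typescript
// Byte layout taken from the commit message; all names below are illustrative assumptions.
const MAGIC = [0x49, 0x46, 0x50, 0x4c]; // ASCII "IFPL"
const NONCE_LEN = 12;
const TAG_LEN = 16; // the GCM auth tag travels inside the ciphertext
const HEADER_LEN = MAGIC.length + 1 + NONCE_LEN; // 4 + 1 + 12 = 17

interface InfiplotEnvelope {
  version: number;
  nonce: Uint8Array;
  ciphertext: Uint8Array; // encrypted payload followed by the 16-byte tag
}

// Concatenate magic + version + nonce + ciphertext into one buffer.
function packEnvelope(version: number, nonce: Uint8Array, ciphertext: Uint8Array): Uint8Array {
  if (nonce.length !== NONCE_LEN) throw new Error("nonce must be 12 bytes");
  const out = new Uint8Array(HEADER_LEN + ciphertext.length);
  out.set(MAGIC, 0);
  out[4] = version & 0xff;
  out.set(nonce, 5);
  out.set(ciphertext, HEADER_LEN);
  return out;
}

// Split a file back into its fields, rejecting anything too short or with the wrong magic.
function parseEnvelope(bytes: Uint8Array): InfiplotEnvelope {
  if (bytes.length < HEADER_LEN + TAG_LEN) throw new Error("file too short");
  for (let i = 0; i < MAGIC.length; i++) {
    if (bytes[i] !== MAGIC[i]) throw new Error("bad magic");
  }
  return {
    version: bytes[4],
    nonce: bytes.slice(5, HEADER_LEN),
    ciphertext: bytes.slice(HEADER_LEN),
  };
}
```

The parsed nonce and ciphertext would then be handed to an AES-GCM decrypt call; per the commit message, a tampered file and a wrong key are reported with the same error.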


62 Commits
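
The resolveVoice fast-path fix in commit 65b7daff0b replaces the comparison `(voiceProvider === "stepfun") === serverStepfun` with explicit per-provider equality checks. The sketch below illustrates why the original form is ambiguous. It assumes only the two provider names given in the log; `matchesAmbiguous` and `matchesExplicit` are hypothetical helpers, not functions from the repository.

```typescript
type TtsProvider = "xiaomi" | "stepfun";

// Ambiguous form quoted in the commit message: anything that is not "stepfun"
// (including undefined or an unknown value such as "azure") counts as a match
// on a non-StepFun server.
function matchesAmbiguous(voiceProvider: string | undefined, serverStepfun: boolean): boolean {
  return (voiceProvider === "stepfun") === serverStepfun;
}

// Explicit form: the voice must carry exactly the provider the server runs.
function matchesExplicit(voiceProvider: string | undefined, serverProvider: TtsProvider): boolean {
  if (serverProvider === "stepfun") return voiceProvider === "stepfun";
  return voiceProvider === "xiaomi";
}
```

With the explicit form, an undefined or unrecognized provider matches neither branch, so the caller can fall back to re-provisioning instead of using the voice directly.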