- Player name: stored in localStorage, injected into Architect/Writer/InsertBeat
prompts so NPCs address the player by name, displayed in dialogue UI
- Freeform input: compact button at choice nodes expands to text input, LLM
classifier routes to insert-beat (interactive NPC response) or change-scene
- SettingsModal: unified panel merging player name, voice toggle (with
collapsible TTS key section), replacing the old TtsKeyModal
- Insert-beat upgrade: prompt now requires NPC reaction when characters are
present, shared by both freeform and Vision paths
- IME guard: isComposing check on freeform input to prevent CJK mid-composition
submission
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds a "导出图集" action at the bottom-right of the play canvas that
snapshots the current session into localStorage and opens
/gallery#id=<id> in a new tab — the original play page keeps running
untouched. In parallel, sends the doc to /api/gallery-pack and
downloads the result as a binary .infiplot file the player can send
to a friend.
The snapshot pulls in:
- Every visited scene's image + beat graph + recorded visit trail
- All AI-prefetched alternate scenes (a new resolvedPrefetchesRef in
PlayInner captures each prefetch as it resolves, so abandoned
branches the engine already paid to generate are kept)
- Character names + basePortraitUrl (voice base64 / styleReference
are stripped — they aren't needed for replay)
/gallery is a no-network interactive replay:
- Per-beat advance and per-choice navigation. Picked choices are
highlighted; unpicked choices are clickable when an alternate was
prefetched, greyed otherwise.
- Stack-based navigation for stepping into branches with one-tap
"返回主线" to collapse back to the main path.
- Top-bar batch download for scene images (including unique
AI-prefetched branch scenes, deduped against the main path) and
character portraits. Fetched with a per-file AbortController + 20s
timeout in a small concurrency pool, then clicked serially.
Prevents one slow CDN response from stranding the busy button.
- In-progress hint banner reminding the player to allow the
browser's "multiple downloads" prompt.
- F-key fullscreen with a top toolbar that auto-retracts after the
initial glance and pops back down on cursor approach.
- Per-scene dialogue panel (fa-clock-rotate-left, matching the
in-game history affordance).
- "导入分享文件" entry on the empty/error state — accepts a friend's
.infiplot, posts to /api/gallery-unpack, renders the decrypted doc.
Share-file format (.infiplot):
- AES-256-GCM via Web Crypto (portable to Cloudflare Workers).
- Layout: 4-byte magic "IFPL" + 1-byte version + 12-byte nonce +
ciphertext (includes 16-byte auth tag).
- Key derived from GALLERY_SECRET via SHA-256.
- GCM's auth tag gives tamper-detection for free; any flip in the
ciphertext/nonce surfaces as "文件校验失败" — same error as wrong-key,
so the distinction can't leak server config.
- Stateless: server keeps no record of issued files.
- GALLERY_SECRET unset → /api/gallery-pack returns 503, the play page
silently skips the share-file download, local view still works.
Rotating the secret invalidates every previously-issued file.
Retention: trimGalleryExports keeps only the 2 most recent localStorage
docs; older ones are evicted before each write so quota stays flat
regardless of how many times the player exports. Share files live on
the player's own disk — no retention concern.
Adds 'gallery_export' to the analytics event schema (scene_count only —
no free text).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Three transport-only optimizations that cut per-session Vercel FOT by ~50-60%:
P0 — Server strips voice.referenceAudioBase64 from already-known characters
in /api/scene and /api/insert-beat responses (defense-in-depth).
P1 — Client strips all voice data from session before sending to
/api/scene, /api/vision, and /api/insert-beat. Voices are retained locally
and re-merged from responses via mergeCharactersPreserveVoice(). The engine
only needs character names + visualDescriptions for scene generation.
P3 — /api/beat-audio returns binary audio (Response with Content-Type)
instead of JSON-wrapped base64, saving ~33% encoding overhead. Client
converts to blob URLs; PlayCanvas accepts a single audioSrc prop.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Thread orientation (portrait|landscape) from client through API, engine,
and image gen. Portrait devices render 1024x1792 (9:16) full-bleed scenes;
desktop/landscape keeps 1792x1024 (16:9). Adds cover-aware click→image
coordinate mapping, session-locked orientation, a shared coerceOrientation
helper, and a choices overflow cap in portrait.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Harden the BYO-mode signal at the API boundary (start/scene/insert-beat):
only clientTts === true drops server TTS, so a stray truthy non-boolean can't
silently disable it. Add a non-blocking prefix hint in TtsKeyModal that warns
when the pasted key prefix (tp-/sk-) mismatches the selected key type — a
mismatch hits the wrong endpoint and plays silently, the symptom BYO fixes.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Public users share one server TTS key, so Xiaomi's per-key RPM/TPM limits
cause silent playback under concurrency. This adds an OPTIONAL path: a user
can store their own Xiaomi MiMo key in the browser and synthesize voice
client-side against Xiaomi's CORS-open endpoints. The key lives only in
localStorage and is never sent to or logged by our server; the shared server
key still serves everyone who does not opt in.
- components/TtsKeyModal.tsx: shared key modal (key-family + region picker),
reused by both the home and play pages
- app/play/page.tsx: silence nudge moved beside the mute toggle; modal opens
in place instead of redirecting to the home page
- app/page.tsx: home page consumes the shared modal + readStoredTtsConfig
- lib/clientTtsConfig.ts, lib/ttsPresets.ts: browser config + region presets
- app/api/{start,scene,insert-beat}: thread per-request voice; lib/types update
- docs/xiaomi-tts-key.md + README note
Verified with tsc --noEmit (exit 0).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Instrument the play flow with 9 content-free custom events (game_start,
art_style_select, style_image_upload, scene_reached, choice_select,
vision_click, tts_toggle, fullscreen_toggle, play_heartbeat) to measure
retention, engagement depth and session duration.
Privacy is enforced by construction, not convention:
- lib/analytics.ts types each event with a discriminated union, so a
payload has no slot for free text — prompts, world guides, uploaded
images and vision output can never reach analytics (compile-time
guarantee, not a comment).
- track() no-ops without window.umami and never throws into the app.
- coarse 30s heartbeat fires only while the tab is visible.
- script stays gated on NEXT_PUBLIC_UMAMI_* env (blank → no script),
honours Do-Not-Track, and locks to an exact data-domains allowlist.
- one-line on-site disclosure with a link, shown only when tracking is on.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Symptom: on a choice beat, clicking the dialogue/narration card fired
the vision ("识图") flow instead of doing nothing. Picking an option with
fast clicks that landed on the card repeatedly kicked off the expensive
/api/vision → insert-beat/scene chain — janky and confusing.
Root cause: the story-card <div> had `pointer-events-none`, so clicks
passed through to the background <img> onClick (handleImageClick), which
on choice beats calls onBackgroundClick → vision.
Fix: the card now owns its clicks (`pointer-events-auto` + handleCardClick):
- mid-typing → completes the text (VN skip affordance, unchanged)
- continue beat → advances, as before
- choice beat → no-op (no vision)
Clicking the actual scene art still triggers vision; choice buttons
already had pointer-events-auto and are unaffected.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When the Runware CDN download was slow (~10-20s over VPN / strict
networks, vs. the optimistic <2s the existing comment assumed), the
preload's 8s timeout fired and setImageUrl committed before the bytes
were actually decoded. The rendered <img> has w-auto h-auto and no
intrinsic aspect-ratio source — until the image loads the layout
collapses to roughly 1px tall, giving the "等了很久 → 一根线 → 突然
出图" jank.
Two compounding fixes:
app/play/page.tsx IMAGE_PRELOAD_TIMEOUT_MS 8000 → 20000.
Real CDN+decode usually finishes well before
this; pushing the ceiling out just stops the
window where we commit a half-loaded URL.
components/PlayCanvas.tsx Add width={1792} height={1024} HTML attrs
to the scene <img>. Doesn't affect rendered
size (still driven by w-auto h-auto and the
maxWidth/maxHeight in sizeStyle); the
browser uses them purely as an intrinsic
aspect-ratio source, so the placeholder box
reserves a 16:9-ish frame even mid-download.
Together: slow networks now mostly wait through preload; on the rare
genuine timeout the layout still holds shape instead of collapsing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cookieless, env-gated page-view tracking via Umami. The <Analytics />
component injects the script only when NEXT_PUBLIC_UMAMI_SRC and
NEXT_PUBLIC_UMAMI_WEBSITE_ID are both set, so local dev and forks send
nothing to our instance. Adds .env.example docs (section 6) and a
homepage footer privacy disclosure. No Cookie consent banner needed.
Flatten the pnpm monorepo (apps/web + packages/*) into a single web package at the repo root.
- Move app/lib/components/scripts/public to root; drop apps/web and packages/* wrappers
- Rewrite tsconfig paths (@infiplot/*) to ./lib/*; turbopack.root = __dirname
- Update Vercel (no root-directory) and Cloudflare (pnpm build:cf at root) deploy paths
- Regenerate pnpm-lock.yaml to drop stale workspace importers
- Bump engines.node to >=22 to match wrangler
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>