Commit Graph

124 Commits

Author SHA1 Message Date
Zonghao Yuan 8cfb2d2860 Merge pull request #36 from zonghaoyuan/staging
Release staging to production
2026-06-06 18:39:44 +08:00
yuanzonghao aed05a0512 fix(web): remove hardcoded maxDuration so Vercel dashboard setting takes effect
Code-level `export const maxDuration = 60` and vercel.json `functions`
block were overriding the dashboard's 300s setting, causing ~100 504
timeouts per day on /api/scene and /api/start. Removing them lets each
Vercel plan use its own default (60s Hobby, 300s Pro) without breaking
self-deployers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-06 18:18:09 +08:00
Zonghao Yuan 812b9a8973 Merge pull request #35 from zonghaoyuan/worktree-remove-byo-api
refactor(web): remove client-side BYO API key feature
2026-06-06 17:45:28 +08:00
yuanzonghao d646ce8db8 refactor(web): remove client-side BYO API key feature
The BYO (Bring Your Own) API key configuration for LLM and image
generation will be re-implemented via Cloudflare Workers. Remove
the client-side implementation to prepare for that migration.

TTS (text-to-speech) BYO key support is intentionally preserved.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-06 17:42:00 +08:00
Zonghao Yuan 3625f935ed Merge pull request #34 from zonghaoyuan/worktree-fix+fot-reduction
fix(web): reduce FOT by stripping redundant voice data from transport
2026-06-05 00:25:51 +08:00
yuanzonghao e88e988de3 fix(web): reduce FOT by stripping redundant voice data from transport
Three transport-only optimizations that cut per-session Vercel FOT (Fast Origin Transfer) by ~50-60%:

P0 — Server strips voice.referenceAudioBase64 from already-known characters
in /api/scene and /api/insert-beat responses (defense-in-depth).

P1 — Client strips all voice data from session before sending to
/api/scene, /api/vision, and /api/insert-beat. Voices are retained locally
and re-merged from responses via mergeCharactersPreserveVoice(). The engine
only needs character names + visualDescriptions for scene generation.

P3 — /api/beat-audio returns binary audio (Response with Content-Type)
instead of JSON-wrapped base64, saving ~33% encoding overhead. Client
converts to blob URLs; PlayCanvas accepts a single audioSrc prop.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-05 00:24:34 +08:00
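Note on e88e988de3: the P1 step can be pictured as a strip-then-merge pair. The following is a minimal sketch, not the repository's code. The `Character` and `Voice` shapes are assumed, `stripVoices` is a hypothetical helper name, and only the name `mergeCharactersPreserveVoice` comes from the commit message (its body here is a guess that matches characters by name).

```typescript
// Assumed shapes for illustration; the real types live in the repository.
interface Voice {
  referenceAudioBase64?: string;
}
interface Character {
  name: string;
  visualDescription: string;
  voice?: Voice;
}

// Hypothetical helper: drop voice data before the session goes over the wire.
function stripVoices(characters: Character[]): Character[] {
  return characters.map(({ voice: _voice, ...rest }) => rest);
}

// Name from the commit message; body assumed. Characters in the response keep
// any voice the server sent, otherwise the locally retained voice is re-attached.
function mergeCharactersPreserveVoice(
  local: Character[],
  incoming: Character[],
): Character[] {
  const localVoices = new Map(local.map((c) => [c.name, c.voice] as const));
  return incoming.map((c) => ({
    ...c,
    voice: c.voice ?? localVoices.get(c.name),
  }));
}
```

In this sketch a character the client has never seen simply arrives without a voice unless the server response includes one.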
Zonghao Yuan c30d11d60b fix(security): harden BYO API header against SSRF and input abuse (#33)
* fix(security): harden BYO API header against SSRF and input abuse

- Add lib/validateUrl.ts with HTTPS-only + public-IP enforcement,
  provider allowlist, IPv6 rejection, and userinfo-in-URL blocking.
- Add lib/byoHeaders.ts — single source of truth for client-side BYO
  header construction (deduplicates app/page.tsx & app/play/page.tsx).
- config.ts: validate BYO endpoints via isPublicUrl(), cap header at
  2 KB, truncate apiKey/model strings, sanitize log output.
- fetchWithRetry: default redirect to "manual" to block 302-to-intranet.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(security): address Copilot review — trim endpoint, strip control chars, drop unused import

- safeEndpoint: trim whitespace before URL validation
- safeString: strip ASCII control characters to prevent header injection
- play/page.tsx: remove unused BYO_STORAGE_KEY import

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-05 00:23:35 +08:00
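Note on c30d11d60b: the checks listed for lib/validateUrl.ts (HTTPS only, no userinfo, no IPv6 literal, no private IPv4 range) can be sketched as below. This is an illustration under stated assumptions, not the repository's implementation: only the name `isPublicUrl` appears in the commit, the list of blocked ranges is not exhaustive, the provider allowlist is omitted, and no DNS resolution is performed, so a hostname that resolves to a private address would still pass.

```typescript
// Sketch of an HTTPS-only, public-host URL check (assumed behavior).
function isPublicUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw.trim());
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== "https:") return false;
  if (url.username !== "" || url.password !== "") return false; // userinfo in URL
  const host = url.hostname;
  if (host.startsWith("[")) return false; // IPv6 literal, e.g. [::1]
  if (host === "localhost" || host.endsWith(".localhost")) return false;
  const ipv4 = /^(\d+)\.(\d+)\.(\d+)\.(\d+)$/.exec(host);
  if (ipv4) {
    const a = Number(ipv4[1]);
    const b = Number(ipv4[2]);
    if (a === 0 || a === 10 || a === 127) return false; // this-network, private, loopback
    if (a === 169 && b === 254) return false; // link-local
    if (a === 172 && b >= 16 && b <= 31) return false; // private
    if (a === 192 && b === 168) return false; // private
  }
  return true;
}
```

The redirect hardening in the same commit corresponds to the standard `redirect: "manual"` option of `fetch`, which stops a validated public URL from silently following a 302 to an internal address.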
Zonghao Yuan bc8f47e601 Merge pull request #32 from zonghaoyuan/fix/tts-doc-guide
docs(tts): prioritize pay-as-you-go path + polish Chinese copy
2026-06-04 23:43:22 +08:00
yuanzonghao e6d60999ac docs(tts): prioritize pay-as-you-go path + polish Chinese copy
Rewrite docs/xiaomi-tts-key.md:
- Lead with the sk- (pay-as-you-go) key path as the recommended route,
  since most users don't have a Token Plan subscription.
- Add direct link to the console/api-keys page.
- Polish Chinese prose throughout for natural phrasing and clarity
  (replace jargon like "0x 计费" → "免费", "端点" → "服务地址", etc.).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-04 20:40:59 +08:00
Zonghao Yuan 4be980d8ee Merge pull request #31 from zonghaoyuan/feat/mobile-portrait-images
feat(web,engine): portrait-orientation scene images for mobile full-bleed
2026-06-04 18:13:51 +08:00
yuanzonghao ea207e103b fix(play): lock orientation pre-paint to avoid portrait loading flash
Set the session orientation in an isomorphic layout effect so portrait
phones don't flash the landscape loading chrome for a frame before the
bootstrap effect runs. State still inits to "landscape" for SSR-safety;
the correction now lands before first paint (no-op on landscape devices).

Addresses Copilot review on PR #31.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 17:30:55 +08:00
yuanzonghao 9fc83de276 feat(web,engine): portrait-orientation scene images for mobile full-bleed
Thread orientation (portrait|landscape) from client through API, engine,
and image gen. Portrait devices render 1024x1792 (9:16) full-bleed scenes;
desktop/landscape keeps 1792x1024 (16:9). Adds cover-aware click→image
coordinate mapping, session-locked orientation, a shared coerceOrientation
helper, and a choices overflow cap in portrait.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 17:30:54 +08:00
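Note on 9fc83de276: a rough sketch of the orientation handling and the cover-aware click mapping described in the commit. Only the name `coerceOrientation` and the pixel sizes come from the commit message. The fallback-to-landscape rule, the `SceneOrientation` type name, and the `sceneImageSize` and `mapCoverClick` helpers are assumptions, and the mapping assumes the image is drawn like CSS `object-fit: cover` with centered positioning.

```typescript
type SceneOrientation = "portrait" | "landscape";

// Name from the commit; body assumed: anything that is not exactly
// "portrait" falls back to "landscape".
function coerceOrientation(value: unknown): SceneOrientation {
  return value === "portrait" ? "portrait" : "landscape";
}

interface Size {
  width: number;
  height: number;
}

// Pixel sizes as stated in the commit message.
function sceneImageSize(orientation: SceneOrientation): Size {
  return orientation === "portrait"
    ? { width: 1024, height: 1792 }
    : { width: 1792, height: 1024 };
}

// Map a click inside the container to image pixel coordinates when the image
// is scaled to cover the container and centered (overflow is cropped).
function mapCoverClick(
  click: { x: number; y: number },
  container: Size,
  image: Size,
): { x: number; y: number } {
  const scale = Math.max(
    container.width / image.width,
    container.height / image.height,
  );
  const offsetX = (container.width - image.width * scale) / 2;
  const offsetY = (container.height - image.height * scale) / 2;
  return { x: (click.x - offsetX) / scale, y: (click.y - offsetY) / scale };
}
```

With a 100x100 container and a 200x100 image, the scale is 1 and 50 pixels are cropped on each side, so a click at the container center maps to the image center.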
Zonghao Yuan 77f5296e18 Merge pull request #30 from zonghaoyuan/feat/multi-provider-compat
feat(ai-client): multi-provider compat — native Anthropic/Google
2026-06-04 17:10:35 +08:00
yuanzonghao 865bf322e9 fix(ai-client): parse Runware host by hostname; doc nits
- inferImageProtocol: match runware.ai by parsed hostname (exact match or
  subdomain) instead of a bare substring, so notrunware.ai /
  runware.ai.evil.com no longer misroute to the Runware protocol
- README: document the image-2-vip → OpenAI-compatible exception; correct the
  Imagen wording (deprecated, EOL 2026-06-24 — not yet discontinued)

Addresses Copilot review on #30.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 17:09:05 +08:00
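Note on 865bf322e9: the difference between a substring test and a parsed-hostname test can be shown in a few lines. This is a sketch of the technique, not the body of `inferImageProtocol`. The helper name `isRunwareHost` is hypothetical.

```typescript
// Match runware.ai exactly or as a parent domain, using the parsed hostname.
function isRunwareHost(rawUrl: string): boolean {
  try {
    const host = new URL(rawUrl).hostname.toLowerCase();
    return host === "runware.ai" || host.endsWith(".runware.ai");
  } catch {
    return false; // unparseable URLs never match
  }
}
```

A bare `rawUrl.includes("runware.ai")` would accept both `notrunware.ai` and `runware.ai.evil.com`, while the hostname comparison rejects them.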
yuanzonghao 83fd5717e7 feat(ai-client): multi-provider compat — native Anthropic/Google + URL tolerance
- TEXT/VISION: add native Anthropic & Google Gemini paths via Vercel AI SDK,
  selectable through TEXT_PROVIDER / VISION_PROVIDER (default openai_compatible)
- IMAGE: expand to openai (gpt-image) / google (Nano Banana) via AI SDK
  alongside the existing Runware task-array and OpenAI-compatible REST paths
- normalizeBaseUrl: tolerate URLs with/without /v1 (or /chat/completions);
  append the per-protocol version segment only for bare hosts
- config: readProvider() reads *_PROVIDER; types: ProviderProtocol + provider?
- deps: @ai-sdk/anthropic, @ai-sdk/google; docs in .env.example + README

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 17:09:05 +08:00
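Note on 83fd5717e7: one way to read the `normalizeBaseUrl` bullet is sketched below. The function name comes from the commit, but the body is an assumption: it strips a trailing `/chat/completions`, keeps any path the user already supplied, and appends a version segment only when the URL is a bare host. The `versionSegment` parameter and its `/v1` default are illustrative.

```typescript
// Assumed behavior: tolerate base URLs with or without a version path.
function normalizeBaseUrl(raw: string, versionSegment = "/v1"): string {
  const url = new URL(raw.trim());
  let path = url.pathname.replace(/\/+$/, ""); // drop trailing slashes
  path = path.replace(/\/chat\/completions$/, ""); // drop a pasted endpoint suffix
  if (path === "") path = versionSegment; // bare host: add the version segment
  return url.origin + path;
}
```

Under these assumptions `https://api.example.com`, `https://api.example.com/v1/` and `https://api.example.com/v1/chat/completions` all normalize to the same base.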
Zonghao Yuan a4dc57a1b6 Merge pull request #28 from zonghaoyuan/feat/byo-tts-key
feat(web): optional bring-your-own Xiaomi MiMo TTS key
2026-06-04 17:00:42 +08:00
yuanzonghao f6226facbd fix(web): address PR #28 review — explicit clientTts boolean + BYO key prefix hint
Harden the BYO-mode signal at the API boundary (start/scene/insert-beat):
only clientTts === true drops server TTS, so a stray truthy non-boolean can't
silently disable it. Add a non-blocking prefix hint in TtsKeyModal that warns
when the pasted key prefix (tp-/sk-) mismatches the selected key type — a
mismatch hits the wrong endpoint and plays silently, the symptom BYO fixes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 16:58:55 +08:00
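Note on f6226facbd: both changes reduce to small predicates. The sketch below is illustrative only. `usesClientTts` and `keyPrefixHint` are hypothetical names, the warning text is invented, and reading `tp-` as the Token Plan prefix is an inference from the related docs commit rather than something this commit states.

```typescript
// Only a literal boolean true switches off server-side TTS; truthy
// non-booleans such as "true" or 1 do not.
function usesClientTts(body: { clientTts?: unknown }): boolean {
  return body.clientTts === true;
}

type TtsKeyType = "tp" | "sk";

// Non-blocking hint: returns a warning when the pasted key carries the
// prefix of the other key type, otherwise null.
function keyPrefixHint(selected: TtsKeyType, key: string): string | null {
  const other: TtsKeyType = selected === "tp" ? "sk" : "tp";
  return key.trim().startsWith(`${other}-`)
    ? `This key starts with "${other}-" but the "${selected}-" key type is selected.`
    : null;
}
```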
yuanzonghao b0b2e922d3 feat(web): optional bring-your-own Xiaomi MiMo TTS key (browser-side synthesis)
Public users share one server TTS key, so Xiaomi's per-key RPM/TPM limits
cause silent playback under concurrency. This adds an OPTIONAL path: a user
can store their own Xiaomi MiMo key in the browser and synthesize voice
client-side against Xiaomi's CORS-open endpoints. The key lives only in
localStorage and is never sent to or logged by our server; the shared server
key still serves everyone who does not opt in.

- components/TtsKeyModal.tsx: shared key modal (key-family + region picker),
  reused by both the home and play pages
- app/play/page.tsx: silence nudge moved beside the mute toggle; modal opens
  in place instead of redirecting to the home page
- app/page.tsx: home page consumes the shared modal + readStoredTtsConfig
- lib/clientTtsConfig.ts, lib/ttsPresets.ts: browser config + region presets
- app/api/{start,scene,insert-beat}: thread per-request voice; lib/types update
- docs/xiaomi-tts-key.md + README note

Verified with tsc --noEmit (exit 0).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 16:58:55 +08:00
Zonghao Yuan 24b674d792 Merge pull request #27 from zonghaoyuan/perf/writer-split
perf(engine): split Writer into Phase A (plan) + Phase B (beats)
2026-06-04 16:53:21 +08:00
yuanzonghao efe021d886 fix(engine): pin entry-beat roster to the plan in Phase B
The Painter composites exactly plan.entryActiveCharacters into the entry
frame (the same roster the Cinematographer framed). Phase B is told to
reuse that roster, but only the entry beat's id was code-enforced — so an
LLM slip could leave a character in the painted frame that the runtime
entry beat says isn't there. Pin activeCharacters onto the plan's entry
beat as a last line of defense, mirroring the existing id pin.

Speaker is intentionally left to the prompt: it's coupled to line/TTS, so
overwriting it could mis-attribute or orphan Phase B's dialogue.

Addresses Copilot review feedback on PR #27.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 15:48:14 +08:00
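Note on efe021d886: the roster pin can be sketched as a pure function over the Phase B output. The field names `entryBeatId`, `entryActiveCharacters` and `activeCharacters` appear in the commit messages. The `Beat` and `WriterPlan` shapes, the helper name `pinEntryRoster`, and locating the entry beat by id are assumptions.

```typescript
// Assumed shapes for illustration.
interface Beat {
  id: string;
  speaker?: string;
  activeCharacters: string[];
}
interface WriterPlan {
  entryBeatId: string;
  entryActiveCharacters: string[];
}

// Overwrite the entry beat's roster with the plan's roster so the runtime
// beat matches the painted frame. The speaker is left untouched on purpose.
function pinEntryRoster(plan: WriterPlan, beats: Beat[]): Beat[] {
  return beats.map((beat) =>
    beat.id === plan.entryBeatId
      ? { ...beat, activeCharacters: [...plan.entryActiveCharacters] }
      : beat,
  );
}
```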
DESKTOP-I1T6TF3\Q 592c82816a Revert "feat(loading): support typewriter story teaser during first scene generation"
This reverts commit 4e4e06ec8a.
2026-06-04 15:13:03 +08:00
DESKTOP-I1T6TF3\Q 587e1e4e7d Revert "fix(loading): use left-aligned text for typewriter teaser to prevent jitter"
This reverts commit e875ac8fd7.
2026-06-04 15:13:03 +08:00
DESKTOP-I1T6TF3\Q 3f45cd4e0f Revert "fix(loading): set w-full on teaser container to prevent horizontal shifting on first line"
This reverts commit 68999aca2a.
2026-06-04 15:13:03 +08:00
DESKTOP-I1T6TF3\Q d19baa2127 Revert "feat(loading): hide footer text when teaser appears and apply pulse animation to teaser text when typing completes"
This reverts commit 5e1a4656ed.
2026-06-04 15:13:03 +08:00
DESKTOP-I1T6TF3\Q a311c24f70 Revert "feat(loading): delay teaser slow-pulse animation by 1s after typewriter ends"
This reverts commit 1ac665ad88.
2026-06-04 15:13:03 +08:00
DESKTOP-I1T6TF3\Q 589bb31416 Revert "feat(loading): slow down teaser typing speed to 65ms and change fallback text to " 请等待\"
This reverts commit 05d9060dc2.
2026-06-04 15:13:03 +08:00
DESKTOP-I1T6TF3\Q a1f3750b6f Revert "feat(loading): make teaser title pulse together with body"
This reverts commit 7164c05b4e.
2026-06-04 15:13:03 +08:00
DESKTOP-I1T6TF3\Q a00095df66 Revert "fix(image): try fetching image as a blob directly first to avoid progressive rendering"
This reverts commit 676c0f1af8.
2026-06-04 15:13:03 +08:00
DESKTOP-I1T6TF3\Q 676c0f1af8 fix(image): try fetching image as a blob directly first to avoid progressive rendering
2026-06-04 15:08:39 +08:00
DESKTOP-I1T6TF3\Q 7164c05b4e feat(loading): make teaser title pulse together with body
2026-06-04 15:03:50 +08:00
DESKTOP-I1T6TF3\Q 05d9060dc2 feat(loading): slow down teaser typing speed to 65ms and change fallback text to " 请等待\
2026-06-04 15:00:50 +08:00
DESKTOP-I1T6TF3\Q 1ac665ad88 feat(loading): delay teaser slow-pulse animation by 1s after typewriter ends
2026-06-04 14:58:57 +08:00
DESKTOP-I1T6TF3\Q 5e1a4656ed feat(loading): hide footer text when teaser appears and apply pulse animation to teaser text when typing completes
2026-06-04 14:56:06 +08:00
DESKTOP-I1T6TF3\Q 68999aca2a fix(loading): set w-full on teaser container to prevent horizontal shifting on first line
2026-06-04 14:51:12 +08:00
DESKTOP-I1T6TF3\Q e875ac8fd7 fix(loading): use left-aligned text for typewriter teaser to prevent jitter
2026-06-04 14:49:42 +08:00
DESKTOP-I1T6TF3\Q 4e4e06ec8a feat(loading): support typewriter story teaser during first scene generation
2026-06-04 14:40:35 +08:00
DESKTOP-I1T6TF3\Q e04c51e875 feat(api): support custom BYO API header override on client fetches and backend config
2026-06-04 13:49:46 +08:00
Zonghao Yuan 1b1d5ce1c5 Merge pull request #29 from zonghaoyuan/staging
Merge staging → main
2026-06-04 11:39:54 +08:00
Zonghao Yuan af155ac107 Merge pull request #24 from zonghaoyuan/fix/optional-image-proxy
fix(play): make scene-image proxy opt-in (default direct-connect)
2026-06-04 11:25:11 +08:00
yuanzonghao 3bf5c92841 perf(engine): split Writer into Phase A (plan) + Phase B (beats)
The Writer was the serial long pole: a single LLM call wrote the scene
skeleton AND the full beats[] graph before anything downstream could
start, so variable-length beat generation blew up tail latency.

Split it into two calls:
- Phase A (runWriterPlan): minimal skeleton the image pipeline needs
  (sceneSummary, sceneKey, entryBeatId, cast, entry roster, entry speaker).
  Serial, on the critical path, kept lightweight.
- Phase B (runWriterBeats): full beats[] + storyStatePatch, written to
  honor the plan. Launched immediately, overlaps the ENTIRE image pipeline
  (cards / cinematographer / portraits / painter), awaited last.

Critical path becomes PhaseA + max(imagePipeline, PhaseB), so the long
beat-writing is hidden behind image gen. A Phase B failure degrades to a
single playable beat synthesized from the plan.

Paired distinct-payload A/B (6 content-matched stories, baseline vs split):
- median end-to-end 42.6s -> 32.2s (-24%)
- mean 46.4s -> 33.1s (-29%)
- worst case 74.7s -> 37.6s (halved)
- no content regression: total Writer output tokens 12858 -> 13699

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 11:17:34 +08:00
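Note on 3bf5c92841: the critical path `PhaseA + max(imagePipeline, PhaseB)` follows from starting Phase B without awaiting it and awaiting it only after the image pipeline. The sketch below shows that control flow with injected stage functions. `runWriterPlan` and `runWriterBeats` are names from the commit. The `Stages` interface, `runImagePipeline`, `generateScene`, and the shape of the fallback beat are assumptions.

```typescript
// Assumed minimal shapes for illustration.
interface Plan {
  sceneSummary: string;
  entryBeatId: string;
}
interface Beat {
  id: string;
  line: string;
}
interface Stages {
  runWriterPlan(): Promise<Plan>;
  runWriterBeats(plan: Plan): Promise<Beat[]>;
  runImagePipeline(plan: Plan): Promise<string>; // resolves to an image URL
}

async function generateScene(stages: Stages) {
  // Phase A is serial: everything downstream needs the plan.
  const plan = await stages.runWriterPlan();

  // Phase B starts immediately and is not awaited yet. The catch handler
  // degrades a failure to a single playable beat built from the plan.
  const beatsPromise = stages.runWriterBeats(plan).catch((): Beat[] => [
    { id: plan.entryBeatId, line: plan.sceneSummary },
  ]);

  // The image pipeline runs while Phase B is in flight.
  const imageUrl = await stages.runImagePipeline(plan);

  // Phase B is awaited last, so only the slower of the two adds latency.
  const beats = await beatsPromise;
  return { plan, imageUrl, beats };
}
```

Attaching the `catch` at launch time also means a Phase B rejection is handled even if the image pipeline is still running when it occurs.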
Zonghao Yuan 8ebacbeb83 Merge pull request #26 from zonghaoyuan/feat/umami-events
feat(web): privacy-friendly Umami custom event tracking
2026-06-04 11:05:15 +08:00
yuanzonghao 4bc47d8210 fix(play): bound preloadImage decode by the timeout; clarify proxy env docs
Addresses two GitHub Copilot review comments on PR #24:

- preloadImage cleared the 20s timeout in onload, before awaiting
  img.decode(), leaving the decode phase unguarded — a hung decode could
  keep the promise pending forever and stall the play loop. Move
  clearTimeout into a single idempotent done() so the timeout stays armed
  through decode() too, matching the stated "timeouts resolve quietly"
  intent.

- .env.example said to leave BOTH proxy vars blank, but shipped
  NEXT_PUBLIC_IMAGE_PROXY_ALLOWED_HOSTS=im.runware.ai. Only
  NEXT_PUBLIC_IMAGE_PROXY_URL gates the feature; the allowlist is inert
  until the URL is set. Corrected the wording, kept the self-documenting
  default value.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 11:04:16 +08:00
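Note on 4bc47d8210: the first fix is an instance of a single idempotent `done()` that owns both the timer and the promise resolution, so the timeout stays armed until the last asynchronous step finishes. The sketch below models that pattern without the DOM. The `load` and `decode` callbacks stand in for the image's load event and `img.decode()`, and the function name, outcome labels and 20000 ms default are illustrative assumptions.

```typescript
type PreloadOutcome = "ready" | "timeout" | "error";

// Resolves (never rejects) once load + decode finish, fail, or time out.
function preloadWithTimeout(
  load: () => Promise<void>,
  decode: () => Promise<void>,
  timeoutMs = 20000,
): Promise<PreloadOutcome> {
  return new Promise((resolve) => {
    let settled = false;
    // Idempotent: the first caller wins, later calls are ignored.
    const done = (outcome: PreloadOutcome) => {
      if (settled) return;
      settled = true;
      clearTimeout(timer); // cleared only here, so decode stays guarded
      resolve(outcome);
    };
    const timer = setTimeout(() => done("timeout"), timeoutMs);
    load()
      .then(decode)
      .then(
        () => done("ready"),
        () => done("error"),
      );
  });
}
```

The bug described in the commit corresponds to calling `clearTimeout` right after `load()` resolves, which would leave a hung `decode()` with nothing to end the wait.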
yuanzonghao e095650944 refactor(web): enforce content-free Umami fields at compile time
Address the Copilot review on #26.

#1 The game_start / art_style_select payload fields were typed as bare
   `string`, so free text could still slip through despite the "content-free
   by construction" claim. Add lib/options.ts as the single source of truth
   for the selector option sets (`as const` → literal-union types), have the
   home OPTS render from those arrays, and type the analytics fields from the
   derived unions (gender/art_style/plot_style/pacing/style) plus a template
   type for `card`. Free text now fails to compile; no casts at call sites.

#2 The /play heartbeat scheduled its 30s interval unconditionally. Gate the
   effect on the same NEXT_PUBLIC_UMAMI_* env used for script injection, so
   nothing is scheduled when the tracker is off (visibility check kept — a
   hidden tab still never emits).

#3 choice_select no longer emits a -1 choice_index: skip the event when the
   index can't be resolved instead of polluting the index distribution.

Verified with tsc (exit 0) and a throwaway negative test: free text in any
of the six fields raises TS2322, valid enum/template values compile.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 10:59:31 +08:00
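Note on e095650944: item #1 relies on two TypeScript features, `as const` arrays that yield literal-union types and template literal types. The sketch below shows the mechanism. The option values, the `fN`/`mN` card template, and all identifier names are hypothetical stand-ins for whatever lib/options.ts actually contains.

```typescript
// Single source of truth for a selector's options (values are made up).
const ART_STYLES = ["anime", "watercolor", "pixel"] as const;

// Derived literal union: "anime" | "watercolor" | "pixel".
type ArtStyle = (typeof ART_STYLES)[number];

// Assumed template type for card ids such as "f14" or "m3".
type CardId = `f${number}` | `m${number}`;

interface GameStartPayload {
  art_style: ArtStyle;
  card: CardId;
}

// Runtime guard for values that arrive as plain strings.
function isArtStyle(value: string): value is ArtStyle {
  return (ART_STYLES as readonly string[]).includes(value);
}

const example: GameStartPayload = { art_style: "anime", card: "f14" };

// The next line would fail to compile with TS2322 because free text is not
// assignable to the literal union:
// const bad: GameStartPayload = { art_style: "my secret prompt", card: "f14" };
```

Because the UI renders from the same array the type is derived from, adding an option in one place updates both the selector and the analytics type.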
yuanzonghao 4bf05f6784 feat(web): add privacy-friendly Umami custom events
Instrument the play flow with 9 content-free custom events (game_start,
art_style_select, style_image_upload, scene_reached, choice_select,
vision_click, tts_toggle, fullscreen_toggle, play_heartbeat) to measure
retention, engagement depth and session duration.

Privacy is enforced by construction, not convention:
- lib/analytics.ts types each event with a discriminated union, so a
  payload has no slot for free text — prompts, world guides, uploaded
  images and vision output can never reach analytics (compile-time
  guarantee, not a comment).
- track() no-ops without window.umami and never throws into the app.
- coarse 30s heartbeat fires only while the tab is visible.
- script stays gated on NEXT_PUBLIC_UMAMI_* env (blank → no script),
  honours Do-Not-Track, and locks to an exact data-domains allowlist.
- one-line on-site disclosure with a link, shown only when tracking is on.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 10:14:08 +08:00
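Note on 4bf05f6784: a discriminated union gives each event a fixed payload shape, and a guarded `track()` keeps analytics failures out of the app. The sketch below illustrates both with three of the nine event names. The payload fields `enabled` and `seconds`, the `UmamiLike` interface, and passing the tracker as a parameter (instead of reading `window.umami`) are assumptions made so the example runs outside a browser.

```typescript
// Each event name fixes its payload; there is no field that accepts free text.
type AnalyticsEvent =
  | { name: "tts_toggle"; data: { enabled: boolean } }
  | { name: "choice_select"; data: { choice_index: number } }
  | { name: "play_heartbeat"; data: { seconds: number } };

// Minimal stand-in for the tracker object.
interface UmamiLike {
  track(name: string, data?: Record<string, unknown>): void;
}

function track(event: AnalyticsEvent, umami: UmamiLike | undefined): void {
  if (!umami) return; // tracker not loaded: no-op
  try {
    umami.track(event.name, event.data);
  } catch {
    // Analytics must never throw into the app.
  }
}
```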
Zonghao Yuan 9f4dcc097b Merge pull request #25 from zonghaoyuan/perf/home-cache-headers
perf(web): pin /home/* assets to 1y immutable cache
2026-06-04 10:04:27 +08:00
yuanzonghao 1fbeea14e6 perf(web): pin /home/* assets to 1y immutable cache
Next.js serves /public files with `Cache-Control: public, max-age=0,
must-revalidate`, so the home covers + first-act JSON were re-fetched on
every visit. Verified against 30 days of Vercel metrics: /home/* alone was
~62% of Fast Data Transfer egress (5.42 GB) while the files total only
~31 MB — the same bytes re-downloaded hundreds of times.

Add a headers() rule scoping `public, max-age=31536000, immutable` to
/home/:path* only; other paths keep their defaults (verified /icon.svg
still returns no-cache). Filenames under /home are stable (covers fN/mN.webp,
first-act JSON by card name), so immutable is safe; if a first-act JSON is
ever re-baked under the same name, bump a query string or purge the cache.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 10:01:06 +08:00
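Note on 1fbeea14e6: the rule described in the commit maps onto the `headers()` option of a Next.js config. The sketch below shows the shape such a rule takes. The `source` pattern and header value are taken from the commit message, while the surrounding object is an assumption about how the repository's config is organized.

```typescript
// In a real project this object would be the default export of the Next.js
// config file; it is a plain constant here so the snippet runs on its own.
const nextConfig = {
  async headers() {
    return [
      {
        // Applies only to /home/ and everything below it.
        source: "/home/:path*",
        headers: [
          {
            key: "Cache-Control",
            value: "public, max-age=31536000, immutable",
          },
        ],
      },
    ];
  },
};
```

`max-age=31536000` is 365 days in seconds, and `immutable` tells the browser not to revalidate within that window, which is why the commit stresses that filenames under /home must stay stable.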
yuanzonghao 4347e5bfdf fix(play): make scene-image proxy opt-in — default deployers connect direct
b805b1d routed every scene <img> through fetch → Blob → createObjectURL to
kill QUIC progressive-paint, but in doing so added an *unconditional*
dependency on a CORS-adding proxy. That breaks the default deployment:
im.runware.ai sends no Access-Control-Allow-Origin, so a direct
fetch().blob() throws and the scene image silently fails to load for anyone
who hasn't stood up the Cloudflare Worker.

Restore the pre-b805b1d behavior as the *default* and make the proxy
strictly opt-in:

  - Direct path (no env set): preloadImage() warms the HTTP cache + decodes,
    then <img> uses the original https://im.runware.ai URL — as before
    b805b1d. No fetch().blob(), no CORS dependency: a fresh clone just works.
  - Proxy path (NEXT_PUBLIC_IMAGE_PROXY_URL set): fetch the proxied URL →
    Blob → createObjectURL, exactly as b805b1d, gaining the QUIC-immune
    HTTP/2 edge + atomic paint.

shouldProxy(url) gates the two paths: proxy only when a base is configured
AND the host is in NEXT_PUBLIC_IMAGE_PROXY_ALLOWED_HOSTS (default
im.runware.ai). data: / non-http / unknown-host URLs always take the direct
path. blobUrlCache + revoke logic is unchanged and safe for both paths
(revoke is a no-op on non-blob: URLs).

The Cloudflare Worker moves out of this repo into a standalone, one-click-
deployable project (infiplot-image-proxy) so the optional infra isn't
carried by every clone; .env.example and the READMEs link to it.

restore: preloadImage() helper deleted by b805b1d
add:     NEXT_PUBLIC_IMAGE_PROXY_ALLOWED_HOSTS (default im.runware.ai)
remove:  worker/ (moved to standalone repo)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 09:57:29 +08:00
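Note on 4347e5bfdf: the gate between the direct path and the proxy path can be sketched as a pure function. In the commit, `shouldProxy(url)` takes only the URL and presumably reads the two `NEXT_PUBLIC_IMAGE_PROXY_*` variables itself. The version below takes a hypothetical `ProxyConfig` argument instead so it can run anywhere, and its exact rules are an assumption based on the commit text.

```typescript
interface ProxyConfig {
  proxyBase: string | undefined; // value of NEXT_PUBLIC_IMAGE_PROXY_URL, if set
  allowedHosts: string[]; // parsed NEXT_PUBLIC_IMAGE_PROXY_ALLOWED_HOSTS
}

// Proxy only when a base is configured AND the host is on the allowlist.
// data: URLs, non-http schemes and unknown hosts take the direct path.
function shouldProxy(url: string, config: ProxyConfig): boolean {
  if (!config.proxyBase) return false;
  let parsed: URL;
  try {
    parsed = new URL(url);
  } catch {
    return false;
  }
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") return false;
  return config.allowedHosts.includes(parsed.hostname);
}
```

With `proxyBase` left undefined the function always returns false, which matches the commit's goal that a fresh clone works without any proxy.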
Zonghao Yuan b86a9507e3 Merge pull request #23 from zonghaoyuan/fix/play-card-click-no-vision
fix(play): story-card clicks no longer trigger vision
2026-06-04 09:34:31 +08:00
DESKTOP-I1T6TF3\Q 010239de44 fix(home): localize first-scene images — drop Runware URL TTL dependency
Card click flow now serves /home/firstscene/{name}.webp from Vercel static
hosting instead of fetching im.runware.ai/... — those URLs have a finite TTL
and would silently rot. Side benefit: backfilled the 18 stories that never had
a local webp (f14-f29, m14, m29), and refreshed the 44 stale webps left over
from a pre-prebake story batch so they actually match their cover art again.

Scope is scene.imageUrl only; characters[].basePortraitUrl still points at
Runware (painter consumes it server-side as referenceImages, where a local
public path won't resolve).

localize-firstact-images.mjs:
- skip the network when the local webp is already on disk (don't re-encode
  what's already correct)
- read imageUrlRemote as a fallback URL when imageUrl is already localized,
  so --force can refresh from the original Runware source
- also localize scene.imageUrl alongside the top-level imageUrl

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-04 09:34:12 +08:00
yuanzonghao a18b91c48c fix(play): story-card clicks no longer trigger vision
Symptom: on a choice beat, clicking the dialogue/narration card fired
the vision ("识图") flow instead of doing nothing. Picking an option with
fast clicks that landed on the card repeatedly kicked off the expensive
/api/vision → insert-beat/scene chain — janky and confusing.

Root cause: the story-card <div> had `pointer-events-none`, so clicks
passed through to the background <img> onClick (handleImageClick), which
on choice beats calls onBackgroundClick → vision.

Fix: the card now owns its clicks (`pointer-events-auto` + handleCardClick):
  - mid-typing   → completes the text (VN skip affordance, unchanged)
  - continue beat → advances, as before
  - choice beat  → no-op (no vision)
Clicking the actual scene art still triggers vision; choice buttons
already had pointer-events-auto and are unaffected.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 09:17:30 +08:00
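Note on a18b91c48c: the three cases listed in the fix form a small decision table. The sketch below expresses it as a pure function. The names `cardClickAction`, `BeatKind` and `CardAction` are hypothetical; the real `handleCardClick` presumably performs these actions directly rather than returning a label.

```typescript
type BeatKind = "continue" | "choice";
type CardAction = "complete-text" | "advance" | "none";

// Decide what a click on the story card should do.
function cardClickAction(isTyping: boolean, beatKind: BeatKind): CardAction {
  if (isTyping) return "complete-text"; // skip the typewriter animation
  return beatKind === "continue" ? "advance" : "none"; // choice beats ignore card clicks
}
```

No branch here leads to vision, which in the fix is reachable only by clicking the scene art itself.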