300 Commits

Author SHA1 Message Date
yuanzonghao ae3dd17e6b feat(web): add player name, freeform input, and unified settings modal
- Player name: stored in localStorage, injected into Architect/Writer/InsertBeat
  prompts so NPCs address the player by name, displayed in dialogue UI
- Freeform input: compact button at choice nodes expands to text input, LLM
  classifier routes to insert-beat (interactive NPC response) or change-scene
- SettingsModal: unified panel merging player name, voice toggle (with
  collapsible TTS key section), replacing the old TtsKeyModal
- Insert-beat upgrade: prompt now requires NPC reaction when characters are
  present, shared by both freeform and Vision paths
- IME guard: isComposing check on freeform input to prevent CJK mid-composition
  submission
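
The IME guard above can be sketched as a small predicate. This is a minimal sketch, not the project's code: `KeyLikeEvent` and `shouldSubmitOnKey` are hypothetical names, and in a React handler the `isComposing` flag would typically be read from `event.nativeEvent`.

```typescript
// Hypothetical minimal shape of the keyboard-event fields the guard reads.
interface KeyLikeEvent {
  key: string;
  shiftKey: boolean;
  isComposing: boolean;
}

// Returns true only for a plain Enter press outside an active IME composition.
function shouldSubmitOnKey(e: KeyLikeEvent): boolean {
  // While a CJK candidate window is open, Enter confirms the candidate
  // and must not submit the input.
  if (e.isComposing) return false;
  // Shift+Enter is left alone so it can insert a newline.
  return e.key === "Enter" && !e.shiftKey;
}
```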

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-07 12:37:50 +08:00
DESKTOP-I1T6TF3\Q b0b5630a25 feat(web): export interactive gallery + encrypted share file
Adds a "导出图集" ("Export gallery") action at the bottom-right of the play canvas that
snapshots the current session into localStorage and opens
/gallery#id=<id> in a new tab — the original play page keeps running
untouched. In parallel, sends the doc to /api/gallery-pack and
downloads the result as a binary .infiplot file the player can send
to a friend.

The snapshot pulls in:
  - Every visited scene's image + beat graph + recorded visit trail
  - All AI-prefetched alternate scenes (a new resolvedPrefetchesRef in
    PlayInner captures each prefetch as it resolves, so abandoned
    branches the engine already paid to generate are kept)
  - Character names + basePortraitUrl (voice base64 / styleReference
    are stripped — they aren't needed for replay)

/gallery is a no-network interactive replay:
  - Per-beat advance and per-choice navigation. Picked choices are
    highlighted; unpicked choices are clickable when an alternate was
    prefetched, greyed otherwise.
  - Stack-based navigation for stepping into branches with one-tap
    "返回主线" ("Back to main path") to collapse back to the main path.
  - Top-bar batch download for scene images (including unique
    AI-prefetched branch scenes, deduped against the main path) and
    character portraits. Files are fetched with a per-file AbortController +
    20s timeout in a small concurrency pool, then clicked serially, so one
    slow CDN response cannot leave the busy button stuck.
  - In-progress hint banner reminding the player to allow the
    browser's "multiple downloads" prompt.
  - F-key fullscreen with a top toolbar that auto-retracts after the
    initial glance and pops back down on cursor approach.
  - Per-scene dialogue panel (fa-clock-rotate-left, matching the
    in-game history affordance).
  - "导入分享文件" ("Import share file") entry on the empty/error state: accepts
    a friend's .infiplot, posts to /api/gallery-unpack, renders the decrypted doc.
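
The batch-download behavior in the list above (per-file AbortController, 20s timeout, small concurrency pool) can be sketched as follows. This is an illustration under stated assumptions: `fetchWithTimeout` and `runPool` are hypothetical names, and the pool size is left to the caller because the commit does not state it.

```typescript
// Fetch one file, aborting it if it takes longer than `ms` milliseconds.
async function fetchWithTimeout(url: string, ms = 20_000): Promise<Blob> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    const res = await fetch(url, { signal: controller.signal });
    if (!res.ok) throw new Error(`HTTP ${res.status}`);
    return await res.blob();
  } finally {
    clearTimeout(timer); // never leave a stray timer behind
  }
}

// Run `worker` over `items` with at most `limit` calls in flight.
// Failures are recorded per item instead of rejecting the whole batch.
async function runPool<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<PromiseSettledResult<R>[]> {
  const results: PromiseSettledResult<R>[] = new Array(items.length);
  let next = 0;
  async function lane(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      try {
        results[i] = { status: "fulfilled", value: await worker(items[i]) };
      } catch (reason) {
        results[i] = { status: "rejected", reason };
      }
    }
  }
  const lanes = Array.from({ length: Math.min(limit, items.length) }, lane);
  await Promise.all(lanes);
  return results;
}
```

A caller would pass something like `runPool(urls, 4, (u) => fetchWithTimeout(u))` and then trigger the browser downloads one by one from the fulfilled results.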

Share-file format (.infiplot):
  - AES-256-GCM via Web Crypto (portable to Cloudflare Workers).
  - Layout: 4-byte magic "IFPL" + 1-byte version + 12-byte nonce +
    ciphertext (includes 16-byte auth tag).
  - Key derived from GALLERY_SECRET via SHA-256.
  - GCM's auth tag gives tamper detection for free; any flip in the
    ciphertext/nonce surfaces as "文件校验失败" ("file verification failed"),
    the same error as a wrong key, so the distinction can't leak server config.
  - Stateless: server keeps no record of issued files.
  - GALLERY_SECRET unset → /api/gallery-pack returns 503, the play page
    silently skips the share-file download, local view still works.
    Rotating the secret invalidates every previously-issued file.
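
The byte layout listed above can be illustrated with a framing helper. This sketch covers only the container format, not the encryption step (which the commit says uses AES-256-GCM via Web Crypto). The function names and the version value `1` are assumptions for illustration.

```typescript
const MAGIC = new Uint8Array([0x49, 0x46, 0x50, 0x4c]); // ASCII "IFPL"
const VERSION = 1; // assumed value; the commit only says "1-byte version"
const NONCE_LEN = 12;
const TAG_LEN = 16; // GCM auth tag, carried at the end of the ciphertext
const HEADER_LEN = MAGIC.length + 1 + NONCE_LEN;

// Concatenate magic + version + nonce + ciphertext into one buffer.
function frameShareFile(nonce: Uint8Array, ciphertext: Uint8Array): Uint8Array {
  if (nonce.length !== NONCE_LEN) throw new Error("nonce must be 12 bytes");
  const out = new Uint8Array(HEADER_LEN + ciphertext.length);
  out.set(MAGIC, 0);
  out[MAGIC.length] = VERSION;
  out.set(nonce, MAGIC.length + 1);
  out.set(ciphertext, HEADER_LEN);
  return out;
}

// Split a file back into its parts, rejecting anything that cannot be valid.
function parseShareFile(bytes: Uint8Array): {
  version: number;
  nonce: Uint8Array;
  ciphertext: Uint8Array;
} {
  if (bytes.length < HEADER_LEN + TAG_LEN) throw new Error("file too short");
  for (let i = 0; i < MAGIC.length; i++) {
    if (bytes[i] !== MAGIC[i]) throw new Error("bad magic");
  }
  return {
    version: bytes[MAGIC.length],
    nonce: bytes.subarray(MAGIC.length + 1, HEADER_LEN),
    ciphertext: bytes.subarray(HEADER_LEN),
  };
}
```

The parsed `nonce` and `ciphertext` would then be handed to the decryption step; any tampering is caught there by the GCM tag rather than by this parser.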

Retention: trimGalleryExports keeps only the 2 most recent localStorage
docs; older ones are evicted before each write so quota stays flat
regardless of how many times the player exports. Share files live on
the player's own disk — no retention concern.
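
A retention pass like the one described could look roughly like this. It is a sketch under assumptions: the `gallery:` key prefix and the `savedAt` timestamp field are hypothetical, and only the function name `trimGalleryExports` and the keep-2 policy come from the commit message.

```typescript
// Minimal subset of the Web Storage API that the sketch relies on.
interface StorageLike {
  readonly length: number;
  key(index: number): string | null;
  getItem(key: string): string | null;
  removeItem(key: string): void;
}

const GALLERY_PREFIX = "gallery:"; // hypothetical key prefix

// Keep the `keep` newest gallery docs and evict the rest.
function trimGalleryExports(storage: StorageLike, keep = 2): void {
  const entries: { key: string; savedAt: number }[] = [];
  // Collect first, remove later: removing while indexing would skip keys.
  for (let i = 0; i < storage.length; i++) {
    const key = storage.key(i);
    if (key === null || !key.startsWith(GALLERY_PREFIX)) continue;
    let savedAt = 0;
    try {
      const parsed = JSON.parse(storage.getItem(key) ?? "{}");
      if (typeof parsed.savedAt === "number") savedAt = parsed.savedAt;
    } catch {
      // Unparseable docs keep savedAt = 0, so they are evicted first.
    }
    entries.push({ key, savedAt });
  }
  entries.sort((a, b) => b.savedAt - a.savedAt); // newest first
  for (const { key } of entries.slice(keep)) storage.removeItem(key);
}
```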

Adds 'gallery_export' to the analytics event schema (scene_count only —
no free text).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-07 12:08:37 +08:00
Zonghao Yuan 5acffb6f85 Merge pull request #43 from zonghaoyuan/worktree-ai-sdk-migration
refactor(ai-client): unify OpenAI-compatible path to AI SDK generateText
2026-06-07 12:04:47 +08:00
yuanzonghao 57b3ac78cd feat(ci): add dual-model PR Agent for automated code review
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-07 12:02:17 +08:00
yuanzonghao f4aca0b59c refactor(ai-client): extract shared createLanguageModel helper
De-duplicate the provider switch logic that was identical in chat.ts
and vision.ts into a shared model.ts module.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-07 11:55:55 +08:00
yuanzonghao 57bc6556ab refactor(ai-client): unify OpenAI-compatible path to AI SDK generateText
Eliminate the dual code path (raw fetch vs AI SDK) for text and vision.
All providers now go through createLanguageModel() + generateText(),
removing chatOpenAiCompatible/analyzeOpenAiCompatible, the manual Usage
type, summarizeUsage, and responseFormat plumbing from 8 call sites.

Key fix: @ai-sdk/openai v3 defaults to the Responses API (/responses);
DeepSeek only supports Chat Completions, so we use .chat() explicitly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-07 00:31:36 +08:00
Zonghao Yuan 0cf7246935 Merge pull request #42 from zonghaoyuan/worktree-portrait-card-support
feat(web): support portrait preset story cards on mobile
2026-06-07 00:28:39 +08:00
yuanzonghao 95a66d94ed feat(web): support portrait preset story cards on mobile
Mobile users clicking preset story cards now get portrait (9:16) scene
images instead of landscape. Previously card paths hardcoded orientation
to "landscape"; now they respect detectOrientation() and load from
firstact-portrait/ with graceful fallback to landscape.

- Add --portrait and --only flags to prebake-firstacts.mjs
- Add --portrait flag to localize-firstact-images.mjs
- Fix prebake STYLE_MAP extraction (moved to lib/options.ts)
- Generate 60 portrait firstact JSONs + firstscene webp assets
- Remove hardcoded "landscape" in play page card path

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-07 00:12:37 +08:00
Zonghao Yuan 04b869eed0 Merge pull request #40 from zonghaoyuan/feat/gal-history
feat(dx): add missing nextjs default rules
2026-06-06 23:36:34 +08:00
Zonghao Yuan 60e324c3b6 Merge pull request #38 from zonghaoyuan/worktree-style-modal-revamp
feat(web): revamp style modal with grid cards, optimized prompts, and polished custom view
2026-06-06 22:59:30 +08:00
Zonghao Yuan feda563b51 Merge pull request #41 from zonghaoyuan/fix/ime-enter-key
fix(web): prevent Enter key from firing during IME composition
2026-06-06 22:57:04 +08:00
yuanzonghao e2cb28ddb9 fix(web): prevent Enter key from firing during IME composition
Add isComposing guard to the homepage prompt textarea so CJK users
no longer accidentally submit while composing. Also show a subtle
"Enter 发送 · Shift+Enter 换行" ("Enter to send · Shift+Enter for newline")
hint when the input has content.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-06 22:42:03 +08:00
yuanzonghao 165dcbc5e6 fix(engine): prevent Architect from seeing literal "auto" styleGuide
Replace session.styleGuide with a descriptive placeholder before the
Architect runs, so its prompt reads a natural sentence instead of the
raw "auto" marker. Also wrap selectStyle in a try-catch so a transient
LLM failure falls back to 吉卜力 (Ghibli) instead of crashing session start.
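
The two changes in this commit can be sketched as below. This is illustrative only: the placeholder sentence, `styleGuideForArchitect`, and `selectStyleSafe` are hypothetical; the "auto" marker and the 吉卜力 fallback come from the commit message.

```typescript
const AUTO_MARKER = "auto";
const FALLBACK_STYLE = "吉卜力";

// Give the Architect a readable sentence instead of the raw "auto" marker.
// The placeholder wording here is an assumption.
function styleGuideForArchitect(styleGuide: string): string {
  return styleGuide === AUTO_MARKER
    ? "The art style will be chosen automatically to fit the world."
    : styleGuide;
}

// Wrap the selector call so a transient LLM failure degrades to the fallback
// style instead of rejecting and aborting session start.
async function selectStyleSafe(selectStyle: () => Promise<string>): Promise<string> {
  try {
    return await selectStyle();
  } catch {
    return FALLBACK_STYLE;
  }
}
```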

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-06 22:28:44 +08:00
yuanzonghao 585f302908 feat(engine): auto-select art style via parallel LLM call
When the user picks "自动" ("Auto"), the client sends styleGuide="auto" to the
server. The orchestrator then runs a lightweight style-selector LLM
call in parallel with the Architect — both only depend on worldSetting,
so there is zero added latency. The selector picks the best-matching
preset from STYLE_MAP based on genre, mood, and setting.
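
The "zero added latency" claim follows from starting both calls before awaiting either. A minimal sketch, assuming hypothetical `runArchitect` and `selectStyle` callbacks that each depend only on worldSetting:

```typescript
// Start both LLM calls at once; total latency is the slower of the two,
// not their sum.
async function startSession<A, S>(
  runArchitect: () => Promise<A>,
  selectStyle: () => Promise<S>,
): Promise<{ outline: A; style: S }> {
  const [outline, style] = await Promise.all([runArchitect(), selectStyle()]);
  return { outline, style };
}
```

As long as the style selector finishes no later than the Architect, the session start takes exactly as long as it did before the selector was added.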

Also moves STYLE_MAP from page.tsx to lib/options.ts so it can be
shared between client and server.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-06 22:08:08 +08:00
Zonghao Yuan 2971423bdd Merge pull request #39 from zonghaoyuan/worktree-docker-compose-deploy
feat(deploy): add Docker Compose self-hosted deployment
2026-06-06 22:04:14 +08:00
baizhi958216 1bdd4dfd13 feat(dx): add missing nextjs default rules
Signed-off-by: baizhi958216 <1475289190@qq.com>
2026-06-06 21:58:04 +08:00
yuanzonghao c82f887a02 feat(deploy): add Docker Compose self-hosted deployment option
Add multi-platform Docker image build (amd64 + arm64) with GitHub Actions
CI that pushes to GHCR on every merge to main. Users can self-host with
a single `docker compose up -d` command.

- Dockerfile: multi-stage build with Next.js standalone output (~150-200MB)
- docker-compose.yml: one-command self-hosted deployment
- .github/workflows/docker.yml: CI workflow with QEMU cross-compilation
- next.config.ts: conditional `output: "standalone"` via BUILD_STANDALONE env
- README (zh/en/ja): restructure deploy section to include Docker option

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-06 21:57:57 +08:00
yuanzonghao 5fab22b9d5 fix(build): exclude scripts/ from tsconfig to fix Vercel build
The gen-style-thumbs.ts script uses Bun-only APIs (import.meta.dir,
Bun.write) which fail TypeScript checking under the project's Next.js
tsconfig. Exclude the scripts directory from compilation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-06 21:57:33 +08:00
yuanzonghao 9dfff39f88 fix(web): address review — remove unused var, add focus-visible, fix comment
- Remove unused `isAuto` variable after magic-wand button removal
- Add focus-visible ring to style cards for keyboard accessibility
- Update DEFAULT_STYLE comment to match actual fallback (吉卜力, Ghibli)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-06 21:55:21 +08:00
Zonghao Yuan e1f213288e Merge pull request #37 from zonghaoyuan/feat/gal-history
feat(play): add play history
2026-06-06 21:50:30 +08:00
yuanzonghao 9a1c292b77 feat(web): polish custom style view layout and UX
Rework custom style view: fixed modal height to match grid view, move
upload and preset-import controls to bottom toolbar alongside cancel/save,
textarea fills remaining space. Add bordered style to cancel button,
improve disabled save button visibility, remove per-card magic-wand
customize button, and add placeholder hint about English prompts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-06 21:47:10 +08:00
yuanzonghao 9794a5a329 fix(play): fix CLAUDE.md typo and dialogue history memo anti-pattern
- Fix @AGETNTS.md → @AGENTS.md typo in CLAUDE.md
- Remove ref read inside useMemo (React anti-pattern causing one-frame stale data)
- Simplify buildDialogueHistory to read visitedBeatIds directly from session.history,
  which also fixes incorrect scene-ID matching when the same ID appears multiple times

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-06 21:39:24 +08:00
yuanzonghao 7185f319a2 feat(web): optimize style prompts and regenerate thumbnails with LLM-chosen scenes
Rewrite all 20 STYLE_MAP prompts with precise art terminology (sfumato,
feibai, bokashi, broken-color, etc.) and richer color/texture descriptions.
KyoAni prompt now references Beyond the Boundary and Sound! Euphonium;
Ghibli references Spirited Away and Howl's Moving Castle. Regenerate all
style thumbnails using a two-step pipeline: DeepSeek picks an optimal
visual-novel scene per style, then Runware renders it. Add cache-busting
query param (thumbV) to thumbnail URLs. Include gen-style-thumbs.ts script
for future regeneration.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-06 21:23:28 +08:00
baizhi958216 5a7daa8452 feat(play): add history dialog
Signed-off-by: baizhi958216 <1475289190@qq.com>
2026-06-06 20:52:10 +08:00
yuanzonghao 31ce3f1d40 feat(web): revamp style modal UI with grid cards, thumbnails, and dual-view
Redesign the painting-style picker inspired by Pollo AI: widen modal to
1400px, show styles as square thumbnail cards in a 4-column grid with
name labels below, add ember glow hover effect, and split custom-style
editing into its own view. Simplify style names (e.g. "京阿尼细腻日常"
[KyoAni delicate slice-of-life] → "京阿尼" [KyoAni]), add 22 .webp preview
thumbnails, and remove the per-preset
override mechanism in favor of a cleaner grid + custom flow.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-06 20:45:08 +08:00
baizhi958216 aef4771d2e feat(dx): add agents.md
Signed-off-by: baizhi958216 <1475289190@qq.com>
2026-06-06 19:49:32 +08:00
Zonghao Yuan 8cfb2d2860 Merge pull request #36 from zonghaoyuan/staging
Release staging to production
2026-06-06 18:39:44 +08:00
yuanzonghao aed05a0512 fix(web): remove hardcoded maxDuration so Vercel dashboard setting takes effect
Code-level `export const maxDuration = 60` and vercel.json `functions`
block were overriding the dashboard's 300s setting, causing ~100 504
timeouts per day on /api/scene and /api/start. Removing them lets each
Vercel plan use its own default (60s Hobby, 300s Pro) without breaking
self-deployers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-06 18:18:09 +08:00
Zonghao Yuan 812b9a8973 Merge pull request #35 from zonghaoyuan/worktree-remove-byo-api
refactor(web): remove client-side BYO API key feature
2026-06-06 17:45:28 +08:00
yuanzonghao d646ce8db8 refactor(web): remove client-side BYO API key feature
The BYO (Bring Your Own) API key configuration for LLM and image
generation will be re-implemented via Cloudflare Workers. Remove
the client-side implementation to prepare for that migration.

TTS (text-to-speech) BYO key support is intentionally preserved.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-06 17:42:00 +08:00
Zonghao Yuan 3625f935ed Merge pull request #34 from zonghaoyuan/worktree-fix+fot-reduction
fix(web): reduce FOT by stripping redundant voice data from transport
2026-06-05 00:25:51 +08:00
yuanzonghao e88e988de3 fix(web): reduce FOT by stripping redundant voice data from transport
Three transport-only optimizations that cut per-session Vercel FOT by ~50-60%:

P0 — Server strips voice.referenceAudioBase64 from already-known characters
in /api/scene and /api/insert-beat responses (defense-in-depth).

P1 — Client strips all voice data from session before sending to
/api/scene, /api/vision, and /api/insert-beat. Voices are retained locally
and re-merged from responses via mergeCharactersPreserveVoice(). The engine
only needs character names + visualDescriptions for scene generation.
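
The P1 strip-and-merge round trip can be sketched like this. The `Character` and `Voice` shapes and the `stripVoices` helper are assumptions; `mergeCharactersPreserveVoice` is named in the commit, but its signature here is a guess.

```typescript
// Hypothetical shapes; only the field names mentioned in the commit are used.
interface Voice {
  referenceAudioBase64?: string;
}
interface Character {
  name: string;
  visualDescription: string;
  voice?: Voice;
}

// Drop voice payloads before the session is sent to the server.
function stripVoices(chars: Character[]): Character[] {
  return chars.map(({ voice: _voice, ...rest }) => rest);
}

// Re-attach locally retained voices to characters coming back in a response.
// A voice present in the response (e.g. for a new character) wins.
function mergeCharactersPreserveVoice(
  local: Character[],
  incoming: Character[],
): Character[] {
  const known = new Map(local.map((c) => [c.name, c.voice] as const));
  return incoming.map((c) => ({ ...c, voice: c.voice ?? known.get(c.name) }));
}
```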

P3 — /api/beat-audio returns binary audio (Response with Content-Type)
instead of JSON-wrapped base64, saving ~33% encoding overhead. Client
converts to blob URLs; PlayCanvas accepts a single audioSrc prop.
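
The ~33% figure for P3 is the standard base64 expansion: every 3 input bytes become 4 output characters. A small arithmetic sketch (function names are illustrative):

```typescript
// Length of the padded base64 encoding of `byteCount` bytes.
function base64Length(byteCount: number): number {
  return 4 * Math.ceil(byteCount / 3);
}

// Fractional size increase of base64 over the raw bytes.
function base64Overhead(byteCount: number): number {
  return base64Length(byteCount) / byteCount - 1;
}
```

For a 300,000-byte audio clip, `base64Length` gives 400,000 characters, so sending the raw bytes saves a third of the original size before any JSON framing is counted.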

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-05 00:24:34 +08:00
Zonghao Yuan c30d11d60b fix(security): harden BYO API header against SSRF and input abuse (#33)
* fix(security): harden BYO API header against SSRF and input abuse

- Add lib/validateUrl.ts with HTTPS-only + public-IP enforcement,
  provider allowlist, IPv6 rejection, and userinfo-in-URL blocking.
- Add lib/byoHeaders.ts — single source of truth for client-side BYO
  header construction (deduplicates app/page.tsx & app/play/page.tsx).
- config.ts: validate BYO endpoints via isPublicUrl(), cap header at
  2 KB, truncate apiKey/model strings, sanitize log output.
- fetchWithRetry: default redirect to "manual" to block 302-to-intranet.
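
A check in the spirit of the first bullet might look like the sketch below. It is not the contents of lib/validateUrl.ts: the provider allowlist is omitted, the private-range list is a common baseline rather than the project's, and only the name `isPublicUrl` comes from the commit.

```typescript
// True for loopback, link-local, and RFC 1918 private IPv4 literals.
function isPrivateIpv4(host: string): boolean {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 169 && b === 254) ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

function isPublicUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw.trim());
  } catch {
    return false; // not parseable as an absolute URL
  }
  if (url.protocol !== "https:") return false; // HTTPS only
  if (url.username !== "" || url.password !== "") return false; // no userinfo
  const host = url.hostname;
  if (host.startsWith("[")) return false; // reject all IPv6 literals
  if (host === "localhost" || host.endsWith(".localhost")) return false;
  return !isPrivateIpv4(host);
}
```

A literal-only check like this does not cover hostnames that resolve to private addresses, which is one reason the commit also sets redirects to "manual" in fetchWithRetry.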

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(security): address Copilot review — trim endpoint, strip control chars, drop unused import

- safeEndpoint: trim whitespace before URL validation
- safeString: strip ASCII control characters to prevent header injection
- play/page.tsx: remove unused BYO_STORAGE_KEY import

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-05 00:23:35 +08:00
Zonghao Yuan bc8f47e601 Merge pull request #32 from zonghaoyuan/fix/tts-doc-guide
docs(tts): prioritize pay-as-you-go path + polish Chinese copy
2026-06-04 23:43:22 +08:00
yuanzonghao e6d60999ac docs(tts): prioritize pay-as-you-go path + polish Chinese copy
Rewrite docs/xiaomi-tts-key.md:
- Lead with the sk- (pay-as-you-go) key path as the recommended route,
  since most users don't have a Token Plan subscription.
- Add direct link to the console/api-keys page.
- Polish Chinese prose throughout for natural phrasing and clarity
  (replace jargon like "0x 计费" [0x billing] → "免费" [free], "端点" [endpoint] → "服务地址" [service address], etc.).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-04 20:40:59 +08:00
Zonghao Yuan 4be980d8ee Merge pull request #31 from zonghaoyuan/feat/mobile-portrait-images
feat(web,engine): portrait-orientation scene images for mobile full-bleed
2026-06-04 18:13:51 +08:00
yuanzonghao ea207e103b fix(play): lock orientation pre-paint to avoid portrait loading flash
Set the session orientation in an isomorphic layout effect so portrait
phones don't flash the landscape loading chrome for a frame before the
bootstrap effect runs. State still inits to "landscape" for SSR-safety;
the correction now lands before first paint (no-op on landscape devices).

Addresses Copilot review on PR #31.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 17:30:55 +08:00
yuanzonghao 9fc83de276 feat(web,engine): portrait-orientation scene images for mobile full-bleed
Thread orientation (portrait|landscape) from client through API, engine,
and image gen. Portrait devices render 1024x1792 (9:16) full-bleed scenes;
desktop/landscape keeps 1792x1024 (16:9). Adds cover-aware click→image
coordinate mapping, session-locked orientation, a shared coerceOrientation
helper, and a choices overflow cap in portrait.
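
The cover-aware click→image mapping can be worked through as follows. This sketch assumes the image is displayed with `object-fit: cover` and centered; the type and function names are illustrative.

```typescript
interface Size {
  width: number;
  height: number;
}
interface Point {
  x: number;
  y: number;
}

// Map a click in container pixels to coordinates in the source image,
// accounting for the scaling and cropping that "cover" applies.
function clickToImage(click: Point, container: Size, image: Size): Point {
  // "cover" scales by the larger ratio so the image fills the container.
  const scale = Math.max(
    container.width / image.width,
    container.height / image.height,
  );
  // The scaled image is centered; offsets are zero or negative (cropped).
  const offsetX = (container.width - image.width * scale) / 2;
  const offsetY = (container.height - image.height * scale) / 2;
  return {
    x: (click.x - offsetX) / scale,
    y: (click.y - offsetY) / scale,
  };
}
```

For example, a click at the center of a 1000x1000 container showing a 1792x1024 image maps to (896, 512), the center of the image, even though 375 display pixels are cropped on each side.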

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 17:30:54 +08:00
Zonghao Yuan 77f5296e18 Merge pull request #30 from zonghaoyuan/feat/multi-provider-compat
feat(ai-client): multi-provider compat — native Anthropic/Google
2026-06-04 17:10:35 +08:00
yuanzonghao 865bf322e9 fix(ai-client): parse Runware host by hostname; doc nits
- inferImageProtocol: match runware.ai by parsed hostname (exact match or
  subdomain) instead of a bare substring, so notrunware.ai /
  runware.ai.evil.com no longer misroute to the Runware protocol
- README: document the image-2-vip → OpenAI-compatible exception; correct the
  Imagen wording (deprecated, EOL 2026-06-24 — not yet discontinued)
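
The hostname fix in the first bullet can be sketched as below. `isRunwareHost` is a hypothetical helper name; the commit only says that inferImageProtocol now matches the parsed hostname exactly or as a subdomain.

```typescript
// True only when the endpoint's host is runware.ai or one of its subdomains.
function isRunwareHost(endpoint: string): boolean {
  let hostname: string;
  try {
    hostname = new URL(endpoint).hostname.toLowerCase();
  } catch {
    return false; // not a parseable absolute URL
  }
  // The leading dot in the suffix check is what rules out "notrunware.ai".
  return hostname === "runware.ai" || hostname.endsWith(".runware.ai");
}
```

A bare `endpoint.includes("runware.ai")` would accept both `notrunware.ai` and `runware.ai.evil.com`, which is the misrouting the commit describes.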

Addresses Copilot review on #30.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 17:09:05 +08:00
yuanzonghao 83fd5717e7 feat(ai-client): multi-provider compat — native Anthropic/Google + URL tolerance
- TEXT/VISION: add native Anthropic & Google Gemini paths via Vercel AI SDK,
  selectable through TEXT_PROVIDER / VISION_PROVIDER (default openai_compatible)
- IMAGE: expand to openai (gpt-image) / google (Nano Banana) via AI SDK
  alongside the existing Runware task-array and OpenAI-compatible REST paths
- normalizeBaseUrl: tolerate URLs with/without /v1 (or /chat/completions);
  append the per-protocol version segment only for bare hosts
- config: readProvider() reads *_PROVIDER; types: ProviderProtocol + provider?
- deps: @ai-sdk/anthropic, @ai-sdk/google; docs in .env.example + README
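
The URL tolerance described for normalizeBaseUrl can be sketched as follows. The exact rules in the project may differ; this version assumes "/v1" as the default version segment and discards any query string.

```typescript
// Normalize a user-supplied base URL so callers can append endpoint paths.
// `versionSegment` is the per-protocol default, assumed to be "/v1" here.
function normalizeBaseUrl(raw: string, versionSegment = "/v1"): string {
  const url = new URL(raw.trim());
  // Drop trailing slashes, then a pasted full endpoint path if present.
  let path = url.pathname.replace(/\/+$/, "");
  path = path.replace(/\/chat\/completions$/, "");
  // Only a bare host gets the version segment appended.
  if (path === "") path = versionSegment;
  return url.origin + path;
}
```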

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 17:09:05 +08:00
Zonghao Yuan a4dc57a1b6 Merge pull request #28 from zonghaoyuan/feat/byo-tts-key
feat(web): optional bring-your-own Xiaomi MiMo TTS key
2026-06-04 17:00:42 +08:00
yuanzonghao f6226facbd fix(web): address PR #28 review — explicit clientTts boolean + BYO key prefix hint
Harden the BYO-mode signal at the API boundary (start/scene/insert-beat):
only clientTts === true drops server TTS, so a stray truthy non-boolean can't
silently disable it. Add a non-blocking prefix hint in TtsKeyModal that warns
when the pasted key prefix (tp-/sk-) mismatches the selected key type — a
mismatch hits the wrong endpoint and plays silently, the symptom BYO fixes.
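
The strict boolean check described above can be sketched as below. `wantsClientTts` is a hypothetical helper; only the `clientTts === true` rule comes from the commit.

```typescript
// Server TTS is skipped only when the request body carries a literal `true`.
// Truthy non-booleans such as "true" or 1 do not count.
function wantsClientTts(body: unknown): boolean {
  return (
    typeof body === "object" &&
    body !== null &&
    (body as { clientTts?: unknown }).clientTts === true
  );
}
```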

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 16:58:55 +08:00
yuanzonghao b0b2e922d3 feat(web): optional bring-your-own Xiaomi MiMo TTS key (browser-side synthesis)
Public users share one server TTS key, so Xiaomi's per-key RPM/TPM limits
cause silent playback under concurrency. This adds an OPTIONAL path: a user
can store their own Xiaomi MiMo key in the browser and synthesize voice
client-side against Xiaomi's CORS-open endpoints. The key lives only in
localStorage and is never sent to or logged by our server; the shared server
key still serves everyone who does not opt in.

- components/TtsKeyModal.tsx: shared key modal (key-family + region picker),
  reused by both the home and play pages
- app/play/page.tsx: silence nudge moved beside the mute toggle; modal opens
  in place instead of redirecting to the home page
- app/page.tsx: home page consumes the shared modal + readStoredTtsConfig
- lib/clientTtsConfig.ts, lib/ttsPresets.ts: browser config + region presets
- app/api/{start,scene,insert-beat}: thread per-request voice; lib/types update
- docs/xiaomi-tts-key.md + README note

Verified with tsc --noEmit (exit 0).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 16:58:55 +08:00
Zonghao Yuan 24b674d792 Merge pull request #27 from zonghaoyuan/perf/writer-split
perf(engine): split Writer into Phase A (plan) + Phase B (beats)
2026-06-04 16:53:21 +08:00
yuanzonghao efe021d886 fix(engine): pin entry-beat roster to the plan in Phase B
The Painter composites exactly plan.entryActiveCharacters into the entry
frame (the same roster the Cinematographer framed). Phase B is told to
reuse that roster, but only the entry beat's id was code-enforced — so an
LLM slip could leave a character in the painted frame that the runtime
entry beat says isn't there. Pin activeCharacters onto the plan's entry
beat as a last line of defense, mirroring the existing id pin.
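
The pin can be sketched as a post-processing step over Phase B's output. This assumes the entry beat is the first beat and uses hypothetical `Plan` and `Beat` shapes; `entryBeatId` is an invented field name, while `entryActiveCharacters` and `activeCharacters` appear in the commit message.

```typescript
interface Beat {
  id: string;
  activeCharacters: string[];
  speaker?: string;
  line?: string;
}
interface Plan {
  entryBeatId: string; // hypothetical field name
  entryActiveCharacters: string[];
}

// Overwrite the entry beat's id and roster with the plan's values so the
// runtime beat always matches what was painted. Speaker and line are kept.
function pinEntryBeat(plan: Plan, beats: Beat[]): Beat[] {
  return beats.map((beat, index) =>
    index === 0
      ? {
          ...beat,
          id: plan.entryBeatId,
          activeCharacters: [...plan.entryActiveCharacters],
        }
      : beat,
  );
}
```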

Speaker is intentionally left to the prompt: it's coupled to line/TTS, so
overwriting it could mis-attribute or orphan Phase B's dialogue.

Addresses Copilot review feedback on PR #27.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 15:48:14 +08:00
DESKTOP-I1T6TF3\Q 592c82816a Revert "feat(loading): support typewriter story teaser during first scene generation"
This reverts commit 4e4e06ec8a.
2026-06-04 15:13:03 +08:00
DESKTOP-I1T6TF3\Q 587e1e4e7d Revert "fix(loading): use left-aligned text for typewriter teaser to prevent jitter"
This reverts commit e875ac8fd7.
2026-06-04 15:13:03 +08:00
DESKTOP-I1T6TF3\Q 3f45cd4e0f Revert "fix(loading): set w-full on teaser container to prevent horizontal shifting on first line"
This reverts commit 68999aca2a.
2026-06-04 15:13:03 +08:00
DESKTOP-I1T6TF3\Q d19baa2127 Revert "feat(loading): hide footer text when teaser appears and apply pulse animation to teaser text when typing completes"
This reverts commit 5e1a4656ed.
2026-06-04 15:13:03 +08:00