Commit Graph

300 Commits

Author SHA1 Message Date
baizhi958216 ef3b57953b refactor(ai-client): replace AI SDK adapters with OpenAI SDK
2026-06-11 16:11:44 +08:00
baizhi958216 6cd7d88326 feat(web): fallback to server API routes when no client-side model config is set
When a user has not configured their own model keys in localStorage,
engine calls now automatically route through /api/* server routes
instead of throwing "模型配置未设置" ("model config not set"). This lets Vercel deploys with
server-side environment variables work out of the box.

- Add lib/engineClient.ts as a unified client-side routing layer:
  checks localStorage for BYO config, falls back to POST /api/start,
  /api/scene, /api/vision, /api/classify-freeform, /api/insert-beat
- Update app/play/page.tsx to use engineClient instead of direct
  engine imports; remove buildEngineConfig()
- Update app/page.tsx style-image parsing to also fall back to
  /api/parse-style-image when no local model config exists
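
The routing decision described above can be sketched as follows. This is a minimal illustration, not the actual lib/engineClient.ts: the `ModelConfig` shape, the `modelConfig` storage key, and the `resolveRoute` helper are all assumed names, and storage is injected so the sketch also runs outside a browser.

```typescript
// Hypothetical config shape; the real BYO config fields are not shown in this log.
interface ModelConfig {
  apiKey: string;
  baseUrl: string;
  model: string;
}

// Minimal storage surface (localStorage satisfies it in a browser).
interface KeyValueStore {
  getItem(key: string): string | null;
}

type Route =
  | { kind: "local"; config: ModelConfig }
  | { kind: "server"; url: string };

const CONFIG_KEY = "modelConfig"; // assumed key name

function readLocalConfig(store: KeyValueStore): ModelConfig | null {
  const raw = store.getItem(CONFIG_KEY);
  if (!raw) return null;
  try {
    const parsed = JSON.parse(raw) as Partial<ModelConfig>;
    // An incomplete config counts as "not configured".
    if (!parsed.apiKey || !parsed.baseUrl || !parsed.model) return null;
    return { apiKey: parsed.apiKey, baseUrl: parsed.baseUrl, model: parsed.model };
  } catch {
    return null; // corrupt JSON falls back to the server route
  }
}

// Use the local config when present; otherwise route to /api/<endpoint>.
function resolveRoute(store: KeyValueStore, endpoint: string): Route {
  const config = readLocalConfig(store);
  return config
    ? { kind: "local", config }
    : { kind: "server", url: `/api/${endpoint}` };
}
```

Returning a tagged `Route` instead of throwing keeps the "no config" case an ordinary branch that the caller turns into a POST to the server route.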

Signed-off-by: zhi <zhi@peropero.net>
2026-06-11 12:15:14 +08:00
baizhi958216 0f8e641c4c feat(web): merge SettingsModal and ModelSettingsModal with tab navigation
Signed-off-by: baizhi958216 <1475289190@qq.com>
2026-06-11 12:15:14 +08:00
baizhi958216 94973bc6c6 fix(tts): add non-null assertion in stepfun array access
Signed-off-by: baizhi958216 <1475289190@qq.com>
2026-06-11 12:15:14 +08:00
baizhi958216 b63b694940 refactor(play): use client-side engine API instead of direct fetch
Signed-off-by: baizhi958216 <1475289190@qq.com>
2026-06-11 12:15:14 +08:00
baizhi958216 ab2f42bc42 feat(web): merge TTS settings into ModelSettingsModal, remove from SettingsModal
Signed-off-by: baizhi958216 <1475289190@qq.com>
2026-06-11 12:15:14 +08:00
baizhi958216 6b11a225cd feat(web): add model settings button, modal, and client-side style image parsing
Signed-off-by: baizhi958216 <1475289190@qq.com>
2026-06-11 12:15:14 +08:00
baizhi958216 71216e1602 feat(ui): add ModelSettingsModal for configuring text/image/vision providers
Signed-off-by: baizhi958216 <1475289190@qq.com>
2026-06-11 12:15:14 +08:00
baizhi958216 759319bf28 feat(config): extract STYLE_EXTRACTION_PROMPT to shared lib for client reuse
Signed-off-by: baizhi958216 <1475289190@qq.com>
2026-06-11 12:15:13 +08:00
baizhi958216 a2dd5ad630 feat(config): add client-side model config storage and EngineConfig resolver
Signed-off-by: baizhi958216 <1475289190@qq.com>
2026-06-11 12:15:13 +08:00
baizhi958216 2088bae311 fix(tts): replace Buffer.from with browser-compatible arrayBufferToBase64 in stepfun
Signed-off-by: baizhi958216 <1475289190@qq.com>
2026-06-11 12:15:13 +08:00
Qi Chen e34306997a Merge pull request #63 from zonghaoyuan/feat/export-with-audio
feat(web): embed beat audio into gallery and infiplot exports
2026-06-11 09:36:42 +08:00
DESKTOP-I1T6TF3\Q 621f83c47b feat(web): embed beat audio into gallery and infiplot exports
Walk every speaking beat at export time, reuse the current scene's beatAudioMap,
and synthesize the rest via BYO TTS or /api/beat-audio with a concurrency of 4.
Show a progress toast on the play page while collecting.

Gallery export keeps audio in a sidecar localStorage key so the first paint
is not blocked by JSON.parse-ing several MB of base64; the gallery lazy-loads
it after the first scene image, then plays per-beat audio with a mute toggle
persisted to localStorage. .infiplot share files embed audioByBeatId in the
doc itself (v2); on import the data URIs survive scene swaps and feed back
into the per-beat audio map so replayers hear the original voices for free.
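
A concurrency-limited collection pass like the one described above might look like the sketch below. `mapWithConcurrency` and its `onProgress` callback are illustrative names, not the project's code; the worker stands in for whatever synthesizes one beat.

```typescript
// Run `worker` over `items` with at most `limit` calls in flight at once.
// Results keep the input order regardless of completion order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
  onProgress?: (done: number, total: number) => void,
): Promise<R[]> {
  if (limit < 1) throw new RangeError("limit must be at least 1");
  const results: R[] = new Array(items.length);
  let next = 0;
  let done = 0;

  async function run(): Promise<void> {
    while (next < items.length) {
      const i = next++; // claimed synchronously, so no two runners share an index
      results[i] = await worker(items[i], i);
      done++;
      onProgress?.(done, items.length); // e.g. update a progress toast
    }
  }

  const runners = Array.from({ length: Math.min(limit, items.length) }, () => run());
  await Promise.all(runners);
  return results;
}
```

With a limit of 4, a slow beat only stalls one of the four runners while the others keep draining the queue.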

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-11 09:29:16 +08:00
Zonghao Yuan a61a91060d Merge pull request #62 from zonghaoyuan/feat/home-import-tooltip-infiplot
feat(web): clarify home import button tooltip as "载入infiplot剧情" ("Load infiplot story")
2026-06-10 00:18:06 +08:00
Zonghao Yuan ba3001329b Merge pull request #61 from zonghaoyuan/chore/style-thumb-kyoani-shinkai
chore(home): swap Kyoto Animation and Shinkai style thumbnails
2026-06-10 00:13:00 +08:00
DESKTOP-I1T6TF3\Q 1a50ed9fc4 chore(home): swap Kyoto Animation and Shinkai style thumbnails
Replace the auto-generated kyoani / shinkai style thumbnails with hand-picked
reference frames. Source PNGs were center-cropped to square and re-encoded as
512x512 WEBP (~41KB each) to match the existing thumbnail format. Bumps the
shared cache-buster from v5 to v6 so existing browsers fetch the new files.
2026-06-09 16:38:55 +08:00
DESKTOP-I1T6TF3\Q b72bbd5501 feat(web): clarify home import button tooltip as "载入infiplot剧情" ("Load infiplot story")
The home-page file-import button accepts .infiplot story files. The
tooltip now spells out the file type so users can distinguish it from the
"开始剧情" ("Start story") and "载入预设" ("Load preset") affordances on the same screen.
2026-06-09 16:31:34 +08:00
Zonghao Yuan d15d53ba65 Merge pull request #57 from zonghaoyuan/feat/tts-stepfun-provider
feat(tts): add StepFun preset-voice provider, route by URL + voice tag
2026-06-09 14:28:36 +08:00
yuanzonghao 1a6238f8b8 fix(tts): harden StepFun provider integration
- Validate voice.provider against known whitelist (xiaomi|stepfun) in
  beat-audio route to return a clear 400 instead of falling through
- Move single-char pronouns (他/她) to weak-signal fallback in
  detectGender to avoid false positives on compounds like 其他
- Update .env.example with StepFun configuration examples
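
The first two hardening points can be illustrated with the sketch below. The keyword lists are made up for the example, and stripping the compound 其他 before the pronoun check is one possible way to avoid the false positive; the commit does not say how the real detectGender handles it.

```typescript
const KNOWN_PROVIDERS = ["xiaomi", "stepfun"] as const;
type Provider = (typeof KNOWN_PROVIDERS)[number];

// Whitelist check: anything else should become a clear 400 in the route.
function isKnownProvider(value: unknown): value is Provider {
  return typeof value === "string" && (KNOWN_PROVIDERS as readonly string[]).includes(value);
}

type Gender = "male" | "female" | "unknown";

// Illustrative strong-signal keywords (assumed, not the project's lists).
const STRONG_FEMALE = ["女性", "女生", "少女", "女孩"];
const STRONG_MALE = ["男性", "男生", "少年", "男孩"];

function detectGender(description: string): Gender {
  // Strong signals are checked first.
  if (STRONG_FEMALE.some((k) => description.includes(k))) return "female";
  if (STRONG_MALE.some((k) => description.includes(k))) return "male";
  // Weak-signal fallback: bare pronouns, after dropping the compound 其他 ("other").
  const stripped = description.replace(/其他/g, "");
  const she = stripped.includes("她");
  const he = stripped.includes("他");
  if (she && !he) return "female";
  if (he && !she) return "male";
  return "unknown";
}
```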

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-09 14:24:27 +08:00
Zonghao Yuan 11a49bbc30 Merge pull request #58 from zonghaoyuan/fix/gallery-pack
fix(share): remove infiplot file download event
2026-06-09 14:24:07 +08:00
DESKTOP-I1T6TF3\Q 04f22249c9 fix(tts): make stepfun preset pick case-stable and per-character
- Hash the lowercased description (matching the case-insensitive scoring)
  so the same archetype text picks the same preset regardless of case.
- Thread the character name through provisionVoice -> stepfunProvision as
  the hash salt, so two characters that share archetype keywords spread
  across the top-N candidate presets instead of collapsing on one voice.

Xiaomi path is unaffected (voicedesign mints a unique clip per call).
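
A sketch of the case-stable, name-salted pick described above. The hash is an FNV-1a-style function over UTF-16 code units, chosen here only because it is small and deterministic; the commit does not name the hash the project uses, and `pickPreset` is a hypothetical helper.

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units (illustrative choice).
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep the value an unsigned 32-bit integer
  }
  return hash;
}

// Pick one of the top-N candidates. Lowercasing the description makes the
// pick case-stable; the character name acts as the salt.
function pickPreset(
  candidates: readonly string[],
  description: string,
  characterName: string,
): string {
  if (candidates.length === 0) throw new Error("no candidate presets");
  const key = `${characterName}\u0000${description.toLowerCase()}`;
  return candidates[fnv1a(key) % candidates.length];
}
```

The same character and description always map to the same preset, while two characters with identical descriptions hash different keys and so can land on different candidates.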
2026-06-09 09:14:44 +08:00
baizhi958216 24b97fa3fb chore(share): remove stale gallery pack code
Signed-off-by: baizhi958216 <1475289190@qq.com>
2026-06-08 19:30:04 +08:00
baizhi958216 1d12417cb0 fix(share): remove infiplot file download event before entering the gallery page
Signed-off-by: baizhi958216 <1475289190@qq.com>
2026-06-08 19:24:40 +08:00
DESKTOP-I1T6TF3\Q 19bbee16fe feat(tts): add StepFun preset-voice provider, route by URL + voice tag
Add StepFun step-tts-mini / step-tts-2 / stepaudio-2.5-tts as an alternate
TTS provider alongside Xiaomi MiMo. Auto-detected from TTS_BASE_URL host
(contains `stepfun.com` → StepFun; otherwise → MiMo), mirroring how the
image client infers Runware from `*.runware.ai`.
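
The host-based detection can be sketched as below. `detectTtsProvider` and `hostnameOf` are assumed names, and treating an unparseable URL as MiMo is a choice made for this example only.

```typescript
type TtsProvider = "xiaomi" | "stepfun";

// Returns the lowercased hostname, or null when the URL cannot be parsed.
function hostnameOf(url: string): string | null {
  try {
    return new URL(url).hostname.toLowerCase();
  } catch {
    return null;
  }
}

// Host contains "stepfun.com" -> StepFun; anything else -> MiMo (xiaomi).
function detectTtsProvider(baseUrl: string): TtsProvider {
  const host = hostnameOf(baseUrl);
  return host !== null && host.includes("stepfun.com") ? "stepfun" : "xiaomi";
}
```

Matching on the parsed hostname rather than the raw string avoids a false match when "stepfun.com" only appears in a path or query parameter.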

CharacterVoice becomes a discriminated union on `provider`:
- xiaomi: { referenceAudioBase64, mimeType } — unchanged
- stepfun: { voiceId, model, mimeType } — preset voice ID + chosen model

Provision dispatches on the current cfg's base URL; synthesis dispatches
on the voice's own `provider` tag so a session with mixed voices (e.g. a
provider switch mid-development) routes each beat through the correct
protocol. xiaomiSynthesize now guards against being called with a non-
xiaomi voice, surfacing the bug as a clear runtime error instead of a
TypeScript narrow violation at the access site.
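
The discriminated union and the per-voice dispatch might be sketched as follows. The field names come from the message above; the synth functions are placeholders that return a descriptive string instead of audio, and their signatures are assumptions.

```typescript
type CharacterVoice =
  | { provider: "xiaomi"; referenceAudioBase64: string; mimeType: string }
  | { provider: "stepfun"; voiceId: string; model: string; mimeType: string };

type StepfunVoice = Extract<CharacterVoice, { provider: "stepfun" }>;

// Accepts the whole union and guards at runtime, so a mis-routed voice
// surfaces as a clear error rather than an undefined field access.
function xiaomiSynthesize(voice: CharacterVoice, text: string): string {
  if (voice.provider !== "xiaomi") {
    throw new Error(`xiaomiSynthesize called with a ${voice.provider} voice`);
  }
  return `xiaomi:${voice.referenceAudioBase64.length}:${text}`;
}

function stepfunSynthesize(voice: StepfunVoice, text: string): string {
  return `stepfun:${voice.model}:${voice.voiceId}:${text}`;
}

// Dispatch on the voice's own tag, not on the current configuration.
function synthesize(voice: CharacterVoice, text: string): string {
  switch (voice.provider) {
    case "xiaomi":
      return xiaomiSynthesize(voice, text);
    case "stepfun":
      return stepfunSynthesize(voice, text);
  }
}
```

Because each voice carries its own `provider` tag, a session that contains voices from both providers still routes every beat correctly.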

StepFun has no voicedesign equivalent — only preset voices + voice
cloning from a reference audio upload. Cloning would require an extra
asset per character, so v1 maps the LLM's Chinese voiceDescription to one
of the 32 published preset IDs via gender + age + tone keyword scoring,
with a deterministic hash spread across the top-3 candidates so multiple
characters with similar descriptions don't collapse onto the identical
preset. lineDelivery is accepted but not yet propagated to StepFun's
voice_label.emotion / .style fields — left as a follow-up.
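
The keyword scoring step could look roughly like this. The `Preset` shape, the weights, and the sample data used with it are invented for illustration; real StepFun preset IDs and the project's scoring weights are not listed in this log.

```typescript
type Gender = "male" | "female";

interface Preset {
  id: string;
  gender: Gender;
  keywords: string[]; // age and tone keywords associated with the preset
}

// Score every preset against the description and keep the best `n`.
function topCandidates(
  description: string,
  gender: Gender,
  presets: readonly Preset[],
  n = 3,
): Preset[] {
  const text = description.toLowerCase();
  return presets
    .map((preset) => ({
      preset,
      score:
        (preset.gender === gender ? 2 : 0) + // assumed weight for a gender match
        preset.keywords.filter((k) => text.includes(k.toLowerCase())).length,
    }))
    .sort((a, b) => b.score - a.score) // stable sort: ties keep catalogue order
    .slice(0, n)
    .map((entry) => entry.preset);
}
```

A deterministic hash over the description would then choose one entry from the returned list, which is what spreads similar characters across several presets.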

beat-audio route validation relaxed from `voice.referenceAudioBase64`
(xiaomi-shaped) to `voice.provider` (shape-agnostic), so stepfun voices
pass the gate; provider-specific shape errors still surface from the
synth function.

Observed latency on InfiPlot's dev loop: StepFun step-tts-mini median
~2.3s per beat with 0% timeouts across the test session, vs MiMo's
median ~8s with the long tail tripping the existing 15s synth budget
on roughly 2 of 3 beats. Pricing: step-tts-mini ¥0.9 per 10,000 characters (~¥0.14
per typical 50-beat session) vs MiMo TTS currently free under the
Token Plan creator incentive.

AGENTS.md provider matrix updated to describe both providers and the
discriminated-union dispatch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-08 17:15:02 +08:00
Qi Chen fc62c9edf5 feat(engine): tighten CharacterDesigner prompt to prevent look-alike … (#56)
* feat(engine): tighten CharacterDesigner prompt to prevent look-alike characters

Expand the visualDescription rules into a 6-element mandatory checklist (hair
quad / eyes triad / face & build / outfit quad / personality-driven vibe /
silhouette tag) and add an explicit anti-collision rule comparing against the
existing cast across cross-color-family and cross-silhouette dimensions.

Also upgrade the user-message "已设定角色" ("established characters") block from soft hint to hard
constraint with an explicit pre-write scan step, nudging the LLM into chain-
of-thought differentiation before emitting tags.

All additions land in the session-stable system prefix, so prompt cache
absorbs the extra tokens — per-call billed token delta is ~0.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(engine): replace pose examples with aura descriptors in personality vibe

The PERSONALITY-DRIVEN VIBE element listed concrete poses (arms crossed,
chin tilted up, slight slouch) which contradicted the earlier rule
banning transient poses from visualDescription. Switch to pure
atmosphere/aura keywords so the character card stays pose-neutral.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: yuanzonghao <yuanzonghao123@gmail.com>
2026-06-08 16:27:15 +08:00
Zonghao Yuan cd6c004589 Merge pull request #55 from zonghaoyuan/staging
chore: sync staging to main
2026-06-08 15:47:59 +08:00
yuanzonghao 7c676fc43b fix(play): guard handleExportStory against duplicate clicks
Adds a ref-based mutex so concurrent /api/story-pack requests and
duplicate file downloads cannot be triggered by rapid clicking.
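
A ref-based mutex of this kind can be sketched without React, since a ref is just a mutable `{ current }` object. `guardWithRef` is a hypothetical helper; the actual handler may inline the same check.

```typescript
// Same shape as a React ref object.
interface Ref<T> {
  current: T;
}

// Wrap an async task so that calls made while one is in flight are ignored.
// The wrapper resolves to true when the task ran and false when it was skipped.
function guardWithRef<A extends unknown[]>(
  busy: Ref<boolean>,
  task: (...args: A) => Promise<void>,
): (...args: A) => Promise<boolean> {
  return async (...args: A) => {
    if (busy.current) return false; // a run is already in flight: drop this click
    busy.current = true;
    try {
      await task(...args);
      return true;
    } finally {
      busy.current = false; // release even if the task throws
    }
  };
}
```

A ref is used rather than component state because its value is readable synchronously, so a second click in the same tick already sees the flag set.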

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-08 15:45:36 +08:00
yuanzonghao 75548ce005 Merge pull request #52 from zonghaoyuan/feat/story-share
feat(play): add encrypted story sharing with replay
2026-06-08 09:57:16 +08:00
yuanzonghao 39a7269494 fix(share): harden story share and relocate import button
- Add Content-Length pre-check to story-pack and story-unpack routes
  to reject oversized payloads before buffering the body
- Suppress internal error details in story-unpack catch (was leaking
  e.message to the client)
- Strengthen sceneIndex validation: require non-negative integer
- Guard against undefined storyState when replaying shared stories
- Fix prefetch regression: remove currentBeat?.id from useEffect deps
  that was re-triggering all change-scene prefetches on every beat
- Fix double detach: use else-if so the second replay detach guard
  doesn't fire redundantly after the first already detached
- Align client file-size limit by format (.json 12MB, .infiplot 13MB)
- Move the "载入剧情" ("Load story") import button next to "开始" ("Start") and add a hover tooltip
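
Two of the checks above, sketched under assumptions: `exceedsLimit` and `isValidSceneIndex` are illustrative names, and letting a request without a Content-Length header through (to be caught by a later body-size check) is a choice made for this example.

```typescript
// Pre-check the declared size before reading the body.
function exceedsLimit(contentLength: string | null, maxBytes: number): boolean {
  if (contentLength === null) return false; // header absent: rely on a later body-size check
  const declared = Number(contentLength);
  // Malformed or negative values are rejected along with oversized ones.
  return !Number.isFinite(declared) || declared < 0 || declared > maxBytes;
}

// sceneIndex must be a non-negative integer (not a numeric string, not a float).
function isValidSceneIndex(value: unknown): value is number {
  return typeof value === "number" && Number.isInteger(value) && value >= 0;
}
```

In a route handler the first argument would typically come from `request.headers.get("content-length")`.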

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-08 08:46:05 +08:00
Zonghao Yuan 88808b93b6 docs: simplify the Docker deployment flow by downloading with curl instead of cloning the repo (#53)
* docs: simplify Docker deploy — download two files instead of cloning repo

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(docs): use mkdir -p and guard against .env.local overwrite

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-07 22:39:01 +08:00
Zonghao Yuan cf83b9adea Merge pull request #54 from zonghaoyuan/chore/disable-pr-agent-auto-describe
chore(ci): disable PR Agent auto-describe and AI title
2026-06-07 22:38:12 +08:00
Zonghao Yuan 79d952c309 Merge pull request #51 from zonghaoyuan/feat/gallery-package
feat(gallery): package the scene gallery as a zip download and extract a download utility module
2026-06-07 22:33:21 +08:00
yuanzonghao 867c52c24f fix(gallery): address review findings in zip download module
- Handle downloadImagesAsZip return value and surface errors to user
- Fix inferImageExtension garbage output for data URIs without semicolons
- Scale blob URL revocation delay for large zip files (>5MB → 60s)
- Cap uniqueZipPath dedup loop at 10k iterations with timestamp fallback
- Support relative URLs in inferImageExtension via base URL
- Handle svg+xml MIME subtype correctly
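
An extension-inference helper covering the data-URI, relative-URL, and svg+xml cases might look like the sketch below. The `png` default, the placeholder base URL, and the `extensionFromMime` helper are assumptions; only the name inferImageExtension comes from the commit.

```typescript
// image/svg+xml -> svg, image/jpeg -> jpg, anything unusable -> png (assumed default).
function extensionFromMime(mime: string): string {
  if (!mime.startsWith("image/")) return "png";
  const subtype = mime.slice("image/".length).split("+")[0];
  return subtype === "jpeg" ? "jpg" : subtype || "png";
}

function inferImageExtension(src: string, baseUrl = "http://localhost/"): string {
  if (src.startsWith("data:")) {
    // data:[<mime>][;base64],<payload> : the MIME type ends at the first ';' or ','.
    const comma = src.indexOf(",");
    const header = comma === -1 ? src.slice(5) : src.slice(5, comma);
    return extensionFromMime(header.split(";")[0].toLowerCase());
  }
  try {
    // Resolving against a base lets relative paths parse; the query string is ignored.
    const path = new URL(src, baseUrl).pathname;
    const match = /\.([a-z0-9]+)$/i.exec(path);
    return match ? match[1].toLowerCase() : "png";
  } catch {
    return "png";
  }
}
```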

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-07 22:32:23 +08:00
yuanzonghao e39e9e1c86 chore(ci): disable PR Agent auto-describe and AI title
Collaborators' hand-written PR titles and descriptions were being
overwritten by the automatic /describe run. Disable auto_describe on the
Claude job and set generate_ai_title = false so human-authored metadata
is preserved. Manual /describe via PR comment still works.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-07 22:27:54 +08:00
baizhi958216 0abd5f1525 feat(play): add encrypted story sharing
2026-06-07 17:13:27 +08:00
Zonghao Yuan bd34fd6868 Merge pull request #49 from zonghaoyuan/staging
chore(ci): tune PR Agent — independent banners, bot guard, opus 4.6 model switch
2026-06-07 15:50:29 +08:00
baizhi958216 7925e9c459 feat(gallery): download scene gallery as zip
Signed-off-by: baizhi958216 <1475289190@qq.com>
2026-06-07 15:45:46 +08:00
Zonghao Yuan 3fc8d21b23 Merge pull request #48 from zonghaoyuan/chore/pr-agent-tune
chore(ci): tune PR Agent — independent banners, sharper findings, opus 4.6
2026-06-07 15:37:57 +08:00
yuanzonghao 63cc7a687e chore(ci): stop hiding *.md from PR Agent review
The previous "*.md" ignore glob hid best_practices.md and AGENTS.md from
the /review diff view (visible in PR #48 where the reviewer hallucinated
"this PR does not add a best_practices.md file"). README-style noise on
docs PRs is preferable to silently dropping changes to the project's
authoritative rule files.
2026-06-07 15:35:55 +08:00
yuanzonghao 81b99625d3 chore(ci): tune PR Agent config
- split per-model banners so two model jobs no longer overwrite each other
- raise reviewer findings cap to 8, broaden /improve to readability/cleanup
- enable dual-publishing for high-score suggestions (inline annotations)
- switch Claude model from opus-4-7 to opus-4-6 (fallback sonnet-4-6)
- raise reasoning_effort to high, response_language to zh-CN
- drop two dead config keys silently ignored by upstream schema
- add best_practices.md with 6 project-specific invariants for /improve
2026-06-07 15:29:34 +08:00
Zonghao Yuan 44ecf058ed Merge pull request #47 from zonghaoyuan/staging
Release: staging → main
2026-06-07 15:16:17 +08:00
yuanzonghao df48e73d62 fix(play): sync playerName to active session on settings save
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-07 15:02:57 +08:00
yuanzonghao 4972243a93 fix: address PR Agent review findings across 6 files
Restrict PR Agent workflow to trusted collaborators on PR comments only,
fix UTF-8 byte counting in gallery-pack, correct portrait-to-landscape
fallback orientation, track inserted freeform beats in visitedBeatIds,
allow clearing stored TTS key, and guard empty-string fuzzy match in
style selector.
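
On the UTF-8 byte-counting fix: a string's `length` counts UTF-16 code units, which undercounts the encoded size of non-ASCII text. A minimal sketch of byte-accurate counting, using a hypothetical `utf8ByteLength` helper:

```typescript
// Size of the string once encoded as UTF-8, which is what a payload limit should compare against.
function utf8ByteLength(text: string): number {
  return new TextEncoder().encode(text).length;
}
```

For CJK story text the two measures differ by roughly a factor of three, which matters when enforcing a size limit expressed in bytes.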

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-07 14:40:37 +08:00
Zonghao Yuan 74479c1aa6 Merge pull request #44 from zonghaoyuan/feat/vision-toggle
feat(play): add vision click setting
2026-06-07 14:24:14 +08:00
yuanzonghao 69ae1380cb fix(play): resolve hydration mismatch and fragile pace index
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-07 14:23:44 +08:00
yuanzonghao dc36b1fe9e feat(play): integrate vision click with unified settings modal
Merge vision-click toggle into the shared SettingsModal alongside
player name and TTS key configuration. Remove standalone TtsKeyModal.
Add settings gear button to PlayCanvas dialogue card and header.
Fix fullscreen settings modal not rendering in immersive mode.
Voice toggle uses standard CategorySelect dropdown matching other
tab bar options.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-07 14:15:22 +08:00
Zonghao Yuan 2700de2d9f Merge pull request #46 from zonghaoyuan/feat/new-art-styles
feat(web): add 14 new art styles with thumbnails and reorder style grid
2026-06-07 13:29:45 +08:00
yuanzonghao b57e36571d fix(web): bump thumbV to v5 to avoid stale thumbnail cache
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-07 13:28:45 +08:00
yuanzonghao 53868471c6 feat(web): add 14 new art styles with thumbnails and reorder style grid
Add 14 new painting styles sourced from preset story card generation
scripts: Dunhuang fresco, Persian miniature, Byzantine mosaic, stained
glass, vaporwave, vector illustration, low poly, pop art, glitch art,
papercut, steampunk, xianxia fantasy, dark fairytale, and urban fantasy.

Reorder all 36 styles into logical visual categories (anime → cinematic
→ Eastern traditional → Western traditional → genre → digital → handcraft)
for easier browsing. Update "auto" thumbnail to a 3×3 composite grid and
"custom" thumbnail to a paintbrush-on-canvas concept image.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-07 12:56:54 +08:00
Zonghao Yuan cdcf8513c0 Merge pull request #45 from zonghaoyuan/feat/player-name-and-freeform
feat(web): player name, freeform input & unified settings
2026-06-07 12:38:15 +08:00