Commit Graph

159 Commits

Author SHA1 Message Date
DESKTOP-I1T6TF3\Q cd7619265c chore(web): swap in 6 curated male covers
Replace m7 / m8 / m9 / m11 / m13 / m18 with hand-picked images from
the offline reroll set. Source files at 686×1200 (m7/m8/m9/m11) were
attention-cropped to the existing 960×1200 4:5 target via sharp's
fit:'cover' + position:'attention' — same pipeline as the bed4dc5
cover batch, so the new images blend into the masonry without visible
aspect-ratio drift. m13 (复古未来梦) and m18 (数据幽灵) were already
at 960×1200 and copied straight through.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-03 04:11:26 +08:00
DESKTOP-I1T6TF3\Q cbabc54273 chore(engine): log worldSetting and storyBible at session start
Two lines in startSession: the full worldSetting being fed to the
Architect, and the resulting logline/genreTags/synopsis it produced.
Cheap to keep — fires once per session — and makes it possible to tell
at a glance whether a "story unrelated to my input" report is a frontend
transport bug, a worldSetting layout problem, or the LLM ignoring the
seed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-03 03:51:58 +08:00
DESKTOP-I1T6TF3\Q d241300ed6 fix(web): fall back to current Typewriter phrase + frontload it in worldSetting
Two related fixes so the home start button actually reflects what the
user sees:

1. Lift the Typewriter's current phrase index up to HomePage so start()
   can read which example is on screen right now. When the textarea is
   empty, start() now substitutes that phrase as the user's story seed —
   "what you see is what you play", instead of the previous behavior
   where an empty input produced a generic worldSetting with no plot
   direction and the model invented something unrelated.

2. Restructure the worldSetting string so the user prompt (or the
   chosen Typewriter phrase) sits at the top, alone, wrapped in a
   strong directive ("必须以此为剧情主线,不要偏离"). Before, the seed
   was a single line sandwiched between the gender/style/pace boilerplate
   and the generic "edit with dramatic tension" tail, which the Architect
   tended to skim past when expanding the bible.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-03 03:51:51 +08:00
DESKTOP-I1T6TF3\Q eb9b875454 fix(web): use existing STYLE_MAP key for home 「自动」 fallback
After bed4dc5 renamed style keys to include the (Image N参考) suffix,
the home start() still resolved 「自动」 against the legacy bare name
「京阿尼细腻日常」, leaving styleGuide undefined and tripping the
/api/start required-field check on the default click.

Fall back to "Galgame CG 梦幻光影" — a key that actually exists in
STYLE_MAP — so the default path resolves cleanly without changing the
behavior of explicitly selected styles.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-03 03:37:49 +08:00
DESKTOP-I1T6TF3\Q bed4dc5a8f feat(web): gender-differentiated 4:5 covers + per-card styleGuide prebake
- Regenerate 60 covers (30 male + 30 female) via FLUX with story-specific
  prompts, replacing the prior gender-shared set
- Crop covers to 4:5 (960×1200) via sharp attention cover; matches new
  homepage card aspectRatio
- Persist all 60 prompts to public/home/prompts.json so the prebake step
  can reuse the cover's exact visual anchor (per-card styleGuide) and the
  first-act scene visually carries over from the poster the player clicked
- Restore /play?card= prebaked instant-play path on homepage card click
- Add OpenAI-compatible image route in ai-client for non-Runware endpoints
- Hide Next.js dev indicators globally; tweak F-key fullscreen label

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-03 02:26:35 +08:00
DESKTOP-I1T6TF3\Q 820a5f7e87 feat(web): refactor home preset cards to 16:9 poster style with titles and tags below cover 2026-06-03 02:25:02 +08:00
Zonghao Yuan 8ca818e4d7 Merge pull request #16 from zonghaoyuan/staging
Merge staging into main
2026-06-03 01:19:14 +08:00
Zonghao Yuan 6ddbe7d377 feat: add privacy-friendly Umami page-view analytics (#15)
Cookieless, env-gated page-view tracking via Umami. The <Analytics />
component injects the script only when NEXT_PUBLIC_UMAMI_SRC and
NEXT_PUBLIC_UMAMI_WEBSITE_ID are both set, so local dev and forks send
nothing to our instance. Adds .env.example docs (section 6) and a
homepage footer privacy disclosure. No Cookie consent banner needed.
2026-06-03 01:14:55 +08:00
Zonghao Yuan 639201cd38 docs: replace How it works mermaid with localized SVG diagrams (#14)
The auto-laid-out Mermaid flowchart looked rough. Replace it with a
hand-built Anthropic-style diagram: color-coded role cards, a dashed
scene-generation group, and the speculative pre-generation loop.

Each SVG bakes in a dark background so it renders consistently whether
the viewer's GitHub theme is light or dark (theme is per-viewer, not
per-repo). Ship one localized SVG per README (zh/en/ja) to preserve the
existing per-language diagrams, and strip claude.ai artifact residue
(onclick / var() / context-stroke).

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-03 01:08:08 +08:00
Zonghao Yuan dc5ecd60f6 refactor: flatten monorepo to single web package (#12)
Flatten the pnpm monorepo (apps/web + packages/*) into a single web package at the repo root.

- Move app/lib/components/scripts/public to root; drop apps/web and packages/* wrappers
- Rewrite tsconfig paths (@infiplot/*) to ./lib/*; turbopack.root = __dirname
- Update Vercel (no root-directory) and Cloudflare (pnpm build:cf at root) deploy paths
- Regenerate pnpm-lock.yaml to drop stale workspace importers
- Bump engines.node to >=22 to match wrangler

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-03 00:55:45 +08:00
yuanzonghao 9543c3dba1 docs: note Cloudflare deploy needs Workers Paid Plan
The scene pipeline needs more CPU time than the Workers Free 10ms cap
allows, so Cloudflare deploys require Workers Paid. The old "pick
whichever you prefer" implied false cost parity with Vercel (free
Hobby works), so recommend the one-click Vercel deploy for personal
use. Applied to all three READMEs (zh/en/ja).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-03 00:39:17 +08:00
Zonghao Yuan ee2c29ae63 Merge pull request #11 from zonghaoyuan/merge-staging-to-main
Merge staging into main (#10 的冲突已解决)
2026-06-03 00:07:43 +08:00
yuanzonghao 4725b2cf72 Merge staging into main (#10)
将 staging 全部进展合入 main:Cloudflare 部署、README 重构(中文默认 +
hero header + 版块重排 + 日文截图 + 切换器修复)、LICENSE 修复(恢复 AGPL-3.0
识别)、engine 点击标注迁移至浏览器 Canvas。冲突全部取 staging 版,移除已
废弃的 README.zh-CN.md 与 annotate.ts;合并后 main 树与 staging 完全一致。
2026-06-03 00:00:34 +08:00
yuanzonghao cd9e9012bf docs: point 简体中文 switcher to repo homepage
The relative `README.md` link resolved to GitHub's blob file view instead
of the repo landing page. Use the absolute repo URL so the switcher returns
to the homepage.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-02 22:29:35 +08:00
yuanzonghao 6038216751 docs: reorder README sections, simplify Live Demo badge, add JP screenshots
- Move "How it works" above "Team & Vision" so the technical deep-dive
  follows the screenshots while reader interest is highest
- Simplify the Live Demo badge to a single segment (drop the domain; the
  badge still links to infiplot.com)
- Add the missing Screenshots section to the Japanese README (i18n gap:
  it was added to zh/en but never backported to ja)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-02 22:29:35 +08:00
Zonghao Yuan d263437756 Merge pull request #9 from zonghaoyuan/feat/cloudflare-migration
feat: add Cloudflare Workers deployment alongside Vercel
2026-06-02 22:14:52 +08:00
yuanzonghao 203e63edc2 fix: server-side payload cap + cleaner image abort
Addresses Copilot review on PR #9:

- /api/vision: add MAX_ANNOTATED_BYTES (3 MB) cap on annotatedImageBase64,
  plus an explicit type/non-empty check. Browser annotator resizes to 768
  wide (typically 200-800 KB base64), so 3 MB rejects abusive direct-API
  payloads that would otherwise inflate upstream vision LLM costs.
- annotateClient: replace `img.src = ""` on timeout with removeAttribute
  to avoid the legacy browser behavior of treating empty src as a
  navigation to the current document URL.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-02 22:13:40 +08:00
yuanzonghao 72331bb865 feat: add Cloudflare Workers deployment alongside Vercel
InfiPlot now deploys to either Vercel or Cloudflare Workers — both
targets are first-class. The project is fully stateless (sessions live
on the client), so the Cloudflare side needs only Workers + Workers
Assets and zero D1/KV/R2.

- apps/web/wrangler.jsonc — nodejs_compat, Assets binding, 60s CPU
  limit (Workers Paid required; matches vercel.json maxDuration). I/O
  wait does not count against this budget — fits the LLM-bound
  workload that's most of the runtime.
- apps/web/open-next.config.ts — minimal defineCloudflareConfig (no
  cache needed since the engine is stateless).
- apps/web/package.json — added build:cf / preview:cf / deploy:cf via
  @opennextjs/cloudflare + wrangler (both devDeps); sharp moved from
  dependencies to devDependencies (only used by the manual
  optimize-home-images.mjs / localize-firstact-images.mjs scripts now).
- .gitignore — .open-next, .wrangler, .dev.vars.
- READMEs (3 langs) — Deploy to Cloudflare button next to Vercel,
  plus a Cloudflare section in the env-var setup (wrangler secret put
  + Cloudflare Access for staging access control).

Verified: pnpm typecheck + pnpm build (Vercel path) + pnpm build:cf
(OpenNext bundle: worker 4 KB, server 24 MB, assets 32 MB / 186
files — all within Workers limits) + pnpm preview:cf with the full
play loop (start → scene → background click → CORS-clean Canvas
annotation via Runware CDN → vision LLM → insert-beat) all green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-02 21:47:03 +08:00
yuanzonghao 346d5359d4 refactor(engine): move click annotation from sharp to browser Canvas
The vision pipeline used sharp to draw a click marker on the scene image
server-side (engine/src/annotate.ts) and to render the MOCK_IMAGE
placeholder PNG (engine/src/mockImage.ts). Both moved off the runtime:

- annotateClick → apps/web/lib/annotateClient.ts (Canvas 2D in the
  browser; toDataURL → raw PNG base64 forwarded to /api/vision). Saves
  a server-side image re-fetch per click and frees the engine from
  sharp's native binding (which doesn't run on Cloudflare Workers).
- mockImageDataUri → self-describing SVG data URI (no rendering needed).

VisionRequest contract changes: prevImageUrl + click → annotatedImageBase64.
Server forwards the bytes straight to the vision LLM as image_url.

sharp is removed from packages/engine entirely and from next.config.ts's
serverExternalPackages. apps/web/package.json + lockfile cleanup ships
in the follow-up Cloudflare deployment commit.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-02 21:46:45 +08:00
yuanzonghao dd8b60c06b docs: move one-click deploy section directly below Live Demo
Surfaces the deploy entry point higher on the page so it isn't buried
near the bottom. The section keeps its inline link to the Configuration
guide, so the deploy flow is unaffected by the reorder.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-02 20:55:23 +08:00
yuanzonghao 236f5494c1 docs: add hero header to READMEs and make Chinese the default
- Add centered hero block: SVG wordmark banner, short tagline, and
  project-stat badges (stars/watchers/forks/issues + Live Demo, License,
  LINUX DO forum backlink)
- Swap default README to Chinese (targeting CN developers); English
  moves to README.en.md, Japanese stays README.ja.md
- Add SVG wordmark banner at docs/banner.svg
- Cross-link language switchers and fix per-language deploy envLink anchors

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-02 20:55:23 +08:00
yuanzonghao ee3a614d26 style(web): tidy the homepage QQ group caption
Drops the fa-qq penguin icon and the "扫码加入,或搜索群号" call-to-action
in favor of a plain "QQ群号:575404333" label — the QR right above already
implies scanning, and the column header names the group.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-02 19:32:37 +08:00
yuanzonghao abb6be5cdd feat(web): add QQ beta-community group to homepage + READMEs
Fills the long-empty "内 测 用 户 群" placeholder (was "群二维码 /
邀请链接(待补充)") on the homepage contact grid with the real QQ
group QR (group ID 575404333) plus a scan-or-search line.

Mirrors it across all three READMEs as a scan-to-join block right
after the contact line, rendered from apps/web/public/qq-group.webp
(760×760 QR-only crop with a white quiet zone, ~45KB).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-02 19:26:54 +08:00
DESKTOP-I1T6TF3\Q 3e132ce28b docs: drop the click-to-enlarge caption from README screenshots
GitHub markdown can't host an in-page lightbox or prev/next carousel —
all <script> is stripped server-side, so a clickable thumbnail can only
ever open the raw image in a new page. The hint line was misleading
about that interaction, so just remove it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-02 17:31:32 +08:00
DESKTOP-I1T6TF3\Q bce686a7eb docs: make README screenshots clickable to view full size
Wraps each <img> in an <a href="..."> linking to the same path so GitHub
opens the full-resolution image on click instead of just showing the
inline thumbnail.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-02 17:24:44 +08:00
DESKTOP-I1T6TF3\Q d93c16d836 feat(web): 红果-style homepage + instant-play prebaked first acts
Rewrites all 64 homepage cards (32 男性向 + 32 女性向) as short-drama hook
stories (战神归来 / 重生分手前夜 / 系统选妃 / 穿成乙游男配 / 末世异能 / 民国
谍战 / 修真渡劫 …) and regenerates each cover via FLUX in its assigned art
style (12 styles spread across 64 cards) at 832×1024 ≈4:5.

Click-to-play path: cards now jump straight to /play?card=<name> and hydrate
Session from /home/firstact/<name>.json — the engine pipeline (Architect +
Writer + CharacterDesigner + Painter) has been pre-run for 44/64 cards. The
remaining 20 (m14/m29/f14..f31) are pending an LLM credit top-up; their
clicks fall through to live /api/start for now.

Runware-hosted first-scene images are downloaded into /home/firstscene/
and the JSONs are rewritten to point at the local webp, so click → first
image is bounded by local-disk decode (~100ms) instead of CDN round-trip.

Scripts:
- scripts/generate-home-images.mjs  — rewrites all 64 cover prompts, per-card
  styles baked into prompts, 832×1024 dims to match StoryCard aspect
- scripts/prebake-firstacts.mjs     — POST /api/start × 64 with concurrency
  4, saves StartResponse to public/home/firstact/<name>.json
- scripts/localize-firstact-images.mjs — downloads each prebaked imageUrl
  to public/home/firstscene/<name>.webp (q80, ≤1600px) and rewrites JSON

README: adds Screenshots section (3×3 gallery) to README.md / README.zh-CN.md,
9 in-game shots compressed to docs/screenshots/*.webp (7.5MB → 680KB).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-02 17:20:34 +08:00
DESKTOP-I1T6TF3\Q 9ae91dd3ed feat(play): hug-canvas action buttons, unified mute, enlarged back-link
UI layout (PlayCanvas + play/page.tsx):
- "F · 全 · 屏" button (renamed from 演 · 示 to match what users
  actually mean by F) floats above the canvas, right-aligned, via a
  new `aboveCanvas` ReactNode slot that lives on the relative
  inline-block image wrapper at `bottom-full right-0`. It hugs the
  actual image right edge regardless of aspect ratio.
- "有 · 声 / 静 · 音" button mirrors that on the left via a new
  `aboveCanvasLeft` slot.
- Both slots also render inside the loading placeholder so the two
  controls appear from frame one, before the scene image arrives.
- InfiPlot back-link grows from 15px to 22/26px (mobile/desktop) with
  a slightly larger arrow, matching the brand'\''s presence on the
  homepage hero.
- Canvas-bottom metadata row (image dims on left, tutorial hint on
  right) dropped. The "—" placeholder and "···" loading state looked
  like stray punctuation; users found them noisy.
- Footer collapses to a single centered "Ⅰ · Ⅰ" mark.

Audio gating logic (play/page.tsx):
- Collapse the two-flag audio gate into one source of truth. The
  homepage "语音配音" choice no longer lives in a separate
  `audioEnabledRef` flag that gates `fetchBeatAudio` independently
  of the in-page mute state. Instead the `muted` useState lazy
  initializer reads `sessionStorage["infiplot:custom"].audioEnabled`
  and projects it inversely (audioEnabled=false → muted=true) so
  the 静音/有声 button correctly reflects the homepage selection
  from the first frame. The in-page toggle remains the source of
  truth from then on (persisted to localStorage:infiplot:muted).
- This fixes a visible disconnect where picking "关闭" on the
  homepage left the play page showing 有声 because the in-page
  state had no link to the homepage choice.
- The sessionStorage read uses the renamed key "infiplot:custom"
  (the infiplot rename PR changed it from yume:custom on the home
  side but the play side hadn'\''t been updated to match).

No new TTS quota is ever burned while muted: fetchBeatAudio'\''s
mutedRef.current early-return is the only path to /api/beat-audio
and is checked before the fetch fires; mute transitions also abort
in-flight requests.
2026-06-02 15:42:49 +08:00
Zonghao Yuan cffe4da4ca docs: streamline 3 READMEs and fix EN language switcher (#6)
Slim overview across EN/zh/JA, drop badges/blockquote/contributing, trim LICENSE header; fix the English switcher to point at the repo homepage instead of the GitHub site root.
2026-06-02 15:33:08 +08:00
DESKTOP-I1T6TF3\Q 6da87df73a feat(web): home story-card polish + play page back-link rebrand
Home (apps/web/app/page.tsx):
- StoryCard locked to uniform aspectRatio "4 / 5". The previous
  "placeholder 4/5 → naturalRatio after onLoad" flow coupled card
  height to lazy-load order: cards still below the fold sat at the
  placeholder ratio while above-the-fold cards snapped to their
  image's actual ratio (1.6 landscape vs 0.75 portrait vs 1.23
  squarish), so the gallery looked inconsistent until a hard refresh
  re-decoded everything from cache synchronously. Fixed ratio +
  object-cover removes the coupling.
- StoryCard hover overlay collapsed from two sibling layers
  (backdrop-blur + mask-image + dark gradient sibling) into one
  element with a pure rgba(0,0,0,…) linear-gradient and an opacity
  transition. Chromium does not animate backdrop-filter cleanly when
  combined with mask-image on an empty element — the first hover
  frame shows a full rectangular blur before the mask kicks in, then
  snaps to the feathered shape ("矩形磨砂 → 渐变磨砂"). One layer,
  one transitioning property, no compositing race.

Play (apps/web/app/play/page.tsx):
- Header back-link "云梦" → "InfiPlot" using the same serif + italic
  ember "Plot" treatment as the homepage wordmark. Resolved against
  the parallel plain-text rebrand already on infiplot/staging by
  keeping the styled version for brand consistency.
2026-06-02 14:42:26 +08:00
Zonghao Yuan 588b668d14 Merge staging into main (#3)
* feat(engine): Architect agent + cross-scene StoryState coherence

Add a dedicated Architect LLM call at session start that expands the terse
world/style prompt into a persistent story bible (logline, genre, second-
person protagonist, cast, engineered opening hook). The bible seeds a
StoryState the Writer reads and patches every scene, carried + merged
across cuts (applyStoryStatePatch) so the story keeps a spine from beat
one instead of jumping between scenes.

- prompts: inject web-novel / short-drama / galgame craft into Writer +
  Architect; Writer emits storyStatePatch to update the running bible
- director: parallelize voice + non-entry portraits with the Painter
  (only entry-beat portraits block paint) to offset Architect latency
- architect: chat/parse guarded so a malformed response never aborts start
- types: StoryState / StoryStatePatch; required on Start/SceneResponse

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: add AGPL-3.0 license, README i18n, and TTS accuracy fix (#2)

* docs: add AGPL-3.0 license, README i18n, and TTS accuracy fix

- LICENSE: add GNU AGPL v3 with InfiPlot copyright notice
- README.md: rewrite for open-source project, fix TTS description
  (TTS uses MiMo's own protocol, not OpenAI-compatible)
- README.zh-CN.md: add Simplified Chinese translation
- README.ja.md: add Japanese translation
- package.json: change license from UNLICENSED to AGPL-3.0-only

* fix: address Copilot review — .env.example TTS comment, zh-CN formatting

- .env.example: clarify TTS uses MiMo's own protocol, not OpenAI-compatible
- README.md: 'land paper after paper' → 'publish paper after paper'
- README.zh-CN.md: add spaces around '5 月', fix code formatting
  for model names (deepseek-v4-flash)

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-02 13:41:37 +08:00
Zonghao Yuan 9a3511f220 docs: add AGPL-3.0 license, README i18n, and TTS accuracy fix (#2)
* docs: add AGPL-3.0 license, README i18n, and TTS accuracy fix

- LICENSE: add GNU AGPL v3 with InfiPlot copyright notice
- README.md: rewrite for open-source project, fix TTS description
  (TTS uses MiMo's own protocol, not OpenAI-compatible)
- README.zh-CN.md: add Simplified Chinese translation
- README.ja.md: add Japanese translation
- package.json: change license from UNLICENSED to AGPL-3.0-only

* fix: address Copilot review — .env.example TTS comment, zh-CN formatting

- .env.example: clarify TTS uses MiMo's own protocol, not OpenAI-compatible
- README.md: 'land paper after paper' → 'publish paper after paper'
- README.zh-CN.md: add spaces around '5 月', fix code formatting
  for model names (deepseek-v4-flash)
2026-06-02 13:39:54 +08:00
Zonghao Yuan 70d0927a3e Merge pull request #1 from zonghaoyuan/feature/story-harness-opt
feat(engine): Architect agent + cross-scene StoryState coherence
2026-06-02 13:24:46 +08:00
yuanzonghao 15ce03a912 feat(engine): Architect agent + cross-scene StoryState coherence
Add a dedicated Architect LLM call at session start that expands the terse
world/style prompt into a persistent story bible (logline, genre, second-
person protagonist, cast, engineered opening hook). The bible seeds a
StoryState the Writer reads and patches every scene, carried + merged
across cuts (applyStoryStatePatch) so the story keeps a spine from beat
one instead of jumping between scenes.

- prompts: inject web-novel / short-drama / galgame craft into Writer +
  Architect; Writer emits storyStatePatch to update the running bible
- director: parallelize voice + non-entry portraits with the Painter
  (only entry-beat portraits block paint) to offset Architect latency
- architect: chat/parse guarded so a malformed response never aborts start
- types: StoryState / StoryStatePatch; required on Start/SceneResponse

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-02 11:44:55 +08:00
Zonghao Yuan 16707cc255 Merge pull request #10 from zonghaoyuan/rename/infiplot
chore: complete @yume → @infiplot rename (post-PR#9)
2026-06-02 09:30:23 +08:00
yuanzonghao 8eda27f241 chore: complete @yume → @infiplot rename (post-PR#9)
PR #9 已完成首页和 layout 的视觉品牌迁移,此 commit 补齐剩余的
技术性改名 —— workspace 包名、source import、localStorage 键、
CSS keyframe、内部 header logo、.env.example、README。

- @yume/* → @infiplot/* (6 package.json + 17 imports + lockfile)
- localStorage/sessionStorage: yume:* → infiplot:*
  (含 PR #9 新增的 yume:hintClosed)
- CSS keyframe yume-ripple → infiplot-ripple
- new/play 页面 header logo "云梦" → "InfiPlot"
- 代码注释中的「云梦」style 形容词删除(layout.tsx, page.tsx)
- 根 package.json name + description(描述跟齐 staging
  "AI 实时交互剧情游戏")
- README: tagline / Vercel deploy URL / 目录树 / engine 描述

保留:prompts.ts 的 LLM 体裁术语「视觉小说/galgame」、CustomForm
placeholder 的「视觉小说画风」(图像模型识别的风格名词)。

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-02 09:27:00 +08:00
Zonghao Yuan 5d0a5bb756 Merge pull request #9 from zonghaoyuan/feature/infiplot-low-fi-homepage
feat(web): InfiPlot low-fi homepage with AI-generated cards + gender-…
2026-06-02 01:30:29 +08:00
yuanzonghao 54fda1914c fix(web): gate play-page un-mute prefetch on actual mute transitions
On mount the mute effect fired alongside the scene effect (both call
prefetchSceneAudio), so the initial /api/beat-audio batch was dispatched
twice — the first set aborted mid-flight. Track the previous muted value
in a ref and only re-prefetch on a real transition, leaving the mount-time
synthesis to the scene effect. Addresses Copilot review on PR #9.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-02 01:29:33 +08:00
yuanzonghao 8f94d3aa65 feat(web): editorial homepage rework — flat 2×30 stories, autosize input
Revises the InfiPlot homepage from the initial prototype pass.

Stories data model
- Replaces the artificial 7-hero + 16-gallery split with a flat
  per-gender model: 30 preset stories each for 男性向 / 女性向.
- Renames assets hero*/gallery* → m{0..29} / f{0..29}; same index
  shares aspect ratio across genders so the gender crossfade never
  jumps card height.
- Fills in the missing 女性向 set and expands both genders to 30.

Cards
- StoryCard measures aspect ratio at runtime from the loaded image
  (onLoad → naturalWidth/Height), fixing the frosted-caption band
  reflow on lazy image load. Drops ready/fallback props; single
  masonry map over STORIES[gender].

Hero input
- Single-line <input> → auto-growing <textarea> (rows=1, resize-none)
  so long prompts and long card seeds are fully visible. Enter submits,
  Shift+Enter inserts a newline.
- lining-nums on the input so digits sit on the baseline instead of
  Cormorant's default old-style figures.

Typography / styles
- layout.tsx: editorial fonts (Cormorant Garamond + Inter via
  --font-serif / --font-sans) + Font Awesome; drops Patrick Hand /
  Noto Sans SC and the hand-drawn SVG jitter filters.
- globals.css trimmed to the editorial base (paper grain, hairline,
  num, ripple); play/page.tsx font/style follow-up.

Scripts
- generate-home-images.mjs reworked into a flat 2×30 idempotent
  Runware FLUX.2 generator.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-02 01:14:49 +08:00
DESKTOP-I1T6TF3\Q 136ceff69f feat(web): InfiPlot low-fi homepage with AI-generated cards + gender-reactive hero + audio toggle fix
Rebuilds the landing page from the prototype: 1900px scale-to-fit hero with
hand-drawn SVG-jitter frames, typewriter input + start button, 5 horizontal
collapsible category selectors (with style-picker modal), 7 scattered hero
cards over a 16-card masonry gallery, and project intro panel.

Each card is filled with a Runware FLUX.2 image, pre-generated and stored as
WebP (~2 MB total for 30 cards). Hero card content + image switches by
性向 (男性向 / 女性向); gallery stays shared.

Hover overlay on every card shows title + outline in a bottom-up dark
gradient, matching the prior homepage's interaction style.

Bug fixes uncovered by tracing the form-state → engine pipeline:
- 「语音配音:关闭」was previously stuffed into styleGuide (consumed only by
  FLUX, ignored by TTS). Now serialized as audioEnabled boolean in the
  sessionStorage payload; play page's fetchBeatAudio early-returns when
  false, so no /api/beat-audio request fires.
- 「绘画风格:自动」used to pass the literal Chinese phrase "由模型根据
  prompt 自动判断画风" to FLUX, which painted it as text. Now maps to the
  二次元/galgame default prompt.

Adds reusable scripts under apps/web/scripts/:
- generate-home-images.mjs — Runware FLUX.2 idempotent batch generator
- optimize-home-images.mjs — sharp WebP downscale + recompress

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-01 17:08:55 +08:00
Zonghao Yuan 774f3734fd Merge pull request #8 from zonghaoyuan/feature/vercel-deploy-prep
feat: Vercel Hobby deploy readiness — image URLs, jsonrepair, DeepSeek
2026-06-01 16:31:54 +08:00
yuanzonghao 42a09c42f8 fix: address Copilot review — SSRF validation + log truncation
- annotate.ts: add assertSafeUrl() to reject non-https/data URLs and
  private/reserved IPs (SSRF prevention); cap response body to 10 MB
- jsonParser.ts: truncate raw model output in error log to first 800
  chars to avoid flooding logs / leaking sensitive content
2026-06-01 16:29:08 +08:00
yuanzonghao addbede929 feat: Vercel Hobby deploy readiness — image URLs, jsonrepair, DeepSeek
- Move vercel.json to apps/web/ with correct route paths; cap scene route
  maxDuration 120→60s for Hobby. Root vercel.json removed. Vercel project's
  Root Directory must be set to apps/web (Deploy button URL passes this).
- Switch image transport from base64-in-JSON to Runware-hosted URLs:
  generateImage now uses outputType=URL and returns {imageUrl, imageUuid};
  StartResponse/SceneResponse carry imageUrl; VisionRequest carries
  prevImageUrl (server re-fetches the bytes for click annotation). This
  eliminates the 4.5MB serverless body-size risk.
- Painter and director prefer URL over UUID for referenceImages — the UUID
  returned by Runware imageInference isn't always recognized in the refs
  pipeline (surfaces as `failedToTransferImage`).
- Client preloads scene images via `new Image().decode()` before committing
  to React state, so URL transitions render instantly; prefetched scenes
  also warm the HTTP cache.
- jsonParser uses the jsonrepair package (replaces hand-rolled repair) and
  adds a targeted preRepair regex for the missing-key-close-quote pattern
  that jsonrepair couldn't disambiguate. Full raw model output dumped on
  failure for diagnostic visibility.
- Default text provider switched to DeepSeek v4-flash via direct API
  (significantly more stable JSON than MiMo v2.5-pro). VISION/TTS stay on
  MiMo (DeepSeek has no multimodal / TTS offerings).
- next.config: drop dead experimental.serverActions.bodySizeLimit (no
  server actions used).
- README: real Deploy button URL (zonghaoyuan/yume + root-directory=apps/web
  + TTS/MOCK_IMAGE in env list); refreshed env vars table with optional
  TTS section.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-01 16:04:13 +08:00
Zonghao Yuan a426b82275 Merge pull request #7 from zonghaoyuan/fix/chat-error-handling
fix(ai-client): improve error handling in chat and vision functions
2026-05-31 13:07:39 +08:00
yuanzonghao b44f52de5d Merge fix/copilot-review: address Copilot review comments 2026-05-31 12:57:26 +08:00
yuanzonghao 4eb6c69af9 fix: address Copilot review comments
- Read response.text() before JSON.parse to avoid double serialization
- Use text.slice(0, 500) instead of JSON.stringify(json).slice(0, 500)
- Add typeof content !== 'string' check for stronger type validation
- Add explicit JSON parse error handling with try/catch
2026-05-31 12:57:22 +08:00
yuanzonghao 10359d1a01 fix(ai-client): improve error handling in chat and vision functions
- Add explicit check for empty choices array in both chat.ts and vision.ts
- Add optional chaining for message property access
- Throw descriptive error when API returns no content
- Use English comments consistent with project style
- Fixes debugging issues when upstream returns empty responses

Related to: chat.ts and vision.ts silent empty string return on malformed responses
2026-05-31 12:41:56 +08:00
yuanzonghao c610efcb26 fix(ai-client): improve error handling in chat function
- Add explicit check for empty choices array
- Add optional chaining for message property access
- Throw descriptive error when API returns no content
- Fixes debugging issues when upstream returns empty responses

Resolves: chat.ts silent empty string return on malformed responses
2026-05-31 12:38:55 +08:00
Zonghao Yuan def1b25bd9 feat(engine): multi-agent character consistency pipeline (#6)
* feat(types): Character.voiceDescription rename + visual fields + Scene.sceneKey

Prepares the type surface for the multi-agent scene pipeline:

- Character.description → voiceDescription (clearer pairing with new visualDescription)
- Character gains visualDescription (English appearance card for Painter) +
  basePortraitBase64 + basePortraitUuid (for Runware referenceImages reuse)
- Scene gains sceneKey (English slug for cross-scene img2img continuity) +
  imageUuid (Runware UUID of the scene's rendered image for cheap seedImage
  reuse on subsequent same-sceneKey calls)
- Beat gains activeCharacters[] so the Cinematographer can read which
  characters are on-screen + their poses when composing the establishing shot

Co-Authored-By: QiChen88 <2291969160@qq.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(ai-client): generateImage img2img + multi-reference options + uploadImage

Extends the Runware adapter to support the two anchoring mechanisms FLUX.2
[klein] 9B KV needs for character + scene visual consistency:

- generateImage gains optional { seedImage, referenceImages, strength }:
  seedImage drives img2img (single starting image, sceneKey continuity),
  referenceImages drives multi-reference anchoring (up to 4 character
  portraits, capped per Runware spec). Default strength 0.85 — FLUX
  ignores strength < 0.8.
- uploadImage POSTs a base64 to Runware's imageUpload taskType and
  returns the UUID, so portraits/scene snapshots can be referenced by
  UUID on subsequent calls instead of resending base64 every scene.

Co-Authored-By: QiChen88 <2291969160@qq.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(engine): multi-agent scene pipeline (Writer→CharDesigner+Cinematographer→Painter)

Replaces the single-LLM directScene with a four-agent pipeline that
specializes each concern and parallelizes the slow parts. Adopts the
core idea from #4 (multi-agent dispatch + character visual consistency)
and grafts it onto the Scene/Beat architecture introduced in #2.

Pipeline per Scene (~9-12s critical path with parallelization):

  Writer LLM (序列, ~3s)
    │ outputs: sceneSummary + sceneKey + beats[] (each beat carries
    │           activeCharacters[] with poses)
    │
    ├─ CharacterDesigner LLM × N new chars (并行)
    │     │ outputs: { visualDescription (英文外貌卡), voiceDescription (中文音色卡) }
    │     ├─ FLUX portrait gen → upload → UUID (并行 within agent)
    │     └─ Xiaomi MiMo voicedesign provision (并行 within agent)
    │
    └─ Cinematographer LLM (并行 with CharacterDesigner)
          outputs: { shotType, integratedPrompt (英文构图+机位+人物站位) }

  Painter (FLUX img2img + referenceImages, ~1-3s)
    inputs: integratedPrompt + onStageCharacters' archetype block
            + (optional) prior sceneKey-hit scene as seedImage
            + (optional) character portrait UUIDs as referenceImages
    fallback chain: A) both anchors → B) refs only (保角色) →
                    C) seed only (保背景) → D) pure t2i
    output uploaded → Scene.imageUuid for the next sceneKey hop

Why this carving:
- Writer focuses purely on narrative (drops the voice-design duty
  staging's DIRECTOR_SYSTEM was carrying as a side concern).
- CharacterDesigner bundles visual + voice so the agent that thinks
  "who is this character" produces internally-consistent appearance +
  vocal personality (split agents tend to diverge).
- Cinematographer doesn't need character visualDescriptions —
  Painter appends archetypes after — so it parallelizes with
  CharacterDesigner.
- sceneKey enables cross-scene backdrop continuity that Scene/Beat
  doesn't cover (Scene/Beat only reuses backdrop WITHIN a scene's
  beats; sceneKey reuses across scenes that share a location).

Other changes:
- voice.ts loses provisionVoicesForScene (moved into CharacterDesigner);
  keeps synthesizeBeat for the lazy per-beat /api/beat-audio path.
- renderer.ts deleted (replaced by agents/painter.ts).
- directInsertBeat (vision-driven in-scene exploration) stays single-
  LLM — it forbids new characters and produces no image, so multi-
  agent doesn't apply.

apps/web is unchanged: orchestrator.ts keeps the same exports
(startSession / requestScene / visionDecide / requestInsertBeat /
requestBeatAudio) with identical request/response shapes.

Co-Authored-By: QiChen88 <2291969160@qq.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(engine): Pattern B player POV + JSON repair + drop seedImage tier

Three hotfixes surfaced by manual end-to-end testing of the multi-agent
pipeline.

F1 — Player viewpoint (galgame Pattern B):
  - Writer accepts speaker="你" for player dialog (renders in dialog box,
    never TTS'd because no Character record exists for "你"). Filter POV
    variants (玩家/我/主角/protagonist/player/I/me/...) from
    activeCharacters so CharacterDesigner never wastes API calls on the
    player. Two-layer defense: explicit prompt rule in WRITER_SYSTEM +
    code normalization (POV_VARIANTS set, isPovName, normalizeSpeakerName).
  - Cinematographer and Painter prompts gain "player never in frame" rule
    so the player never appears in any rendered scene.
  - Cinematographer gains dynamic camera policy driven by the entry beat's
    speaker: NPC-speaker → close-up looking toward camera; "你"-speaker →
    medium shot of attentive NPC; no speaker → wide establishing shot.
  - director.ts filters POV from orphanSpeakers so provisionVoiceForName
    never fires for "你".

F2 — JSON parsing robustness:
  - parseJsonLoose gains a 4th repair tier: strip JS-style comments, strip
    trailing commas, insert missing commas between adjacent objects /
    arrays / quoted values. Logs the first 800 chars of raw LLM output
    when all repair attempts fail, so we can see what the model emitted.

F3 — Drop seedImage, use referenceImages for prior scene:
  - FLUX.2 [klein] 9B KV does not support seedImage (img2img). Removed
    Tier A (seedImage+refs) and Tier C (seedImage only) from the Painter
    degradation chain. New layout: prior scene's image slots into
    referenceImages[0] for spatial continuity, character portraits fill
    slots 1-3 (Runware caps at 4 total). Cinematographer instructed to
    emphasize continuity when sceneKey matches a prior scene.

All five package typechecks pass.

Co-Authored-By: QiChen88 <2291969160@qq.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(engine): address Copilot review feedback on #6

Three targeted fixes from PR #6 Copilot review.

F4 — Stale seedImage/img2img docstrings
  Four locations still referenced the original img2img design after F3
  switched to referenceImages-based spatial continuity:
  - types/index.ts:57   Scene.sceneKey docstring
  - types/index.ts:63   Scene.imageUuid docstring
  - director.ts:34      pipeline diagram in module block comment
  - director.ts:128     directScene JSDoc
  Doc-only changes; misleading wording corrected to mention referenceImages.
  (The design-rationale comment in pickPriorSceneReference is kept — it
  explains WHY we don't use seedImage and is load-bearing context.)

F5 — Remove JS-comment stripping from JSON repair pass
  parseJsonLoose's repair tier previously stripped `// ...` and
  `/* ... */` across the entire text, which would corrupt JSON string
  values containing URLs (e.g. "https://example.com" → "https:"). Since
  LLMs in `responseFormat: "json_object"` mode essentially never emit
  comments, dropping the comment-stripping step is a net win for safety.
  Trailing-comma and missing-comma repair (the high-frequency failures)
  are kept.

F6 — Pattern B parity on the insert-beat path
  Previously: directInsertBeat's INSERT_BEAT_SYSTEM forbade any speaker
  not in session.characters, and the orchestrator's unregistered-speaker
  guard demoted such lines to narration. This meant the player could not
  speak via speaker="你" in transient in-scene beats — inconsistent with
  the Writer path.
  Fix:
  - INSERT_BEAT_SYSTEM prompt now allows speaker="你" (NPC name OR "你")
    and rejects other POV variants
  - directInsertBeat applies normalizeSpeakerName to the LLM output, same
    as the Writer path, so POV variants collapse to "你"
  - lineDelivery is dropped when speaker="你" (no TTS for player)
  - orchestrator's unregistered-speaker guard adds a `speaker !== "你"`
    exception so Pattern B player dialog passes through

Co-Authored-By: QiChen88 <2291969160@qq.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(engine): drop "JS-style comments" from parseJsonLoose header

The function header listed JS-style comments as a step-4 repair, but F5
already removed comment stripping from `repairJsonString` because the
regex would corrupt URLs inside JSON string values. The inner function's
comment was updated then; this header was missed.

Doc-only sync from second-round Copilot review on #6.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: QiChen88 <2291969160@qq.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-29 13:30:24 +08:00
Zonghao Yuan e261f4a346 feat: Runware FLUX.2 image + lazy per-beat TTS (#5)
Reduce median scene-load latency from ~30-80s to ~17-25s by switching image generation to Runware FLUX.2 [klein] 9B KV and moving per-beat TTS synthesis off the scene response into a new lazy /api/beat-audio endpoint with hard timeout + abort support.

- feat(image): migrate to Runware FLUX.2 [klein] 9B KV — task-array API, $0.001/image, sub-second inference.
- feat(tts): split /api/scene into directScene + image + voicedesign-provisioning; lazily synth per beat via /api/beat-audio with 15s hard timeout + AbortSignal threaded to MiMo so timed-out calls don't keep burning sockets/quota; client fans out per-beat fetches on scene-id change with abort + identity-check finally to prevent cross-scene beat-id collisions.
- refactor(tts): slim BeatAudioRequest to { beat, voice } — ~800KB per-beat upload dropped to ~160KB by sending only the speaker's voice instead of the full session.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2026-05-28 23:43:51 +08:00
Zonghao Yuan fcd4e6c1ab feat(tts): Xiaomi MiMo per-beat voice + MOCK_IMAGE testing aid (#3)
Adds optional Xiaomi MiMo TTS layer on top of the scene/beat engine and a MOCK_IMAGE flag for cheap local TTS iteration.

- Per-character voice provisioning via MiMo voice design → clone, reference audio persisted in session
- Per-line free-form delivery direction (Director writes "鼓起勇气又害羞,声音发颤" style instructions; sent to MiMo's director channel, never read aloud)
- Per-beat audio served with the scene response; frontend plays via hidden <audio> with typewriter synced to audio duration; mute toggle persisted via localStorage lazy initializer
- Graceful degradation: any TTS step failing → silent beat, game continues
- MOCK_IMAGE=true returns a sharp-generated placeholder PNG so local TTS iteration doesn't burn image tokens
- Recommended config in .env.example: MiMo Token Plan covers TEXT/VISION/TTS with one key (mimo-v2.5-pro for text, mimo-v2.5 omni for vision, mimo-v2.5-tts for TTS)

Squashed from #3:
- feat(tts): 小米 MiMo 逐 beat 配音 + 按 session 角色音色 + 自由文本配音指导
- feat(engine): MOCK_IMAGE 占位图便于本地测试
- fix(tts): address Copilot review on PR #3
- fix(tts): Copilot round-2 review feedback

Known limitation: Session.characters carries the full WAV reference audio (~200-300KB/character base64) and round-trips through every /api/scene, /api/vision, /api/insert-beat request. This is intrinsic to MiMo's design→clone model (voice identity IS the audio, no server-side voiceId). Fixing requires server-side storage which is out of scope; documented for future hardening.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2026-05-28 20:45:21 +08:00