Commit Graph

11 Commits

Author SHA1 Message Date
DESKTOP-I1T6TF3\Q d93c16d836 feat(web): 红果-style homepage + instant-play prebaked first acts
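Rewrites all 64 homepage cards (32 male-oriented + 32 female-oriented) as short-drama hook
stories (War God Returns / Reborn on the Eve of the Breakup / System Consort Selection /
Transmigrated as an Otome-Game Male Supporting Character / Post-Apocalypse Superpowers /
Republican-Era Espionage / Cultivation Tribulation …) and regenerates each cover via FLUX
in its assigned art style (12 styles spread across 64 cards) at 832×1024 ≈4:5.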

Click-to-play path: cards now jump straight to /play?card=<name> and hydrate
Session from /home/firstact/<name>.json — the engine pipeline (Architect +
Writer + CharacterDesigner + Painter) has been pre-run for 44/64 cards. The
remaining 20 (m14/m29/f14..f31) are pending an LLM credit top-up; their
clicks fall through to live /api/start for now.
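
The click-to-play fallback described above can be sketched as a small pure function. This is an illustrative sketch, not the repository's code: `startSourceFor`, the `StartSource` type, and the idea of passing in a set of prebaked card names are all assumptions; only the `/home/firstact/<name>.json` and `/api/start` paths come from the commit message.

```typescript
// Hypothetical result type: where the play page should get its first act from.
type StartSource =
  | { kind: "prebaked"; url: string }
  | { kind: "live"; url: "/api/start" };

// Picks the static prebaked JSON when one exists for the card,
// otherwise falls through to the live endpoint.
function startSourceFor(card: string, prebaked: ReadonlySet<string>): StartSource {
  return prebaked.has(card)
    ? { kind: "prebaked", url: `/home/firstact/${encodeURIComponent(card)}.json` }
    : { kind: "live", url: "/api/start" };
}

// Assumed example data: "m01" is prebaked, "m14" is still pending.
const prebakedCards = new Set(["m01"]);
console.log(startSourceFor("m01", prebakedCards).url); // /home/firstact/m01.json
console.log(startSourceFor("m14", prebakedCards).url); // /api/start
```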

Runware-hosted first-scene images are downloaded into /home/firstscene/
and the JSONs are rewritten to point at the local webp, so click → first
image is bounded by local-disk decode (~100ms) instead of CDN round-trip.
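
The JSON rewrite in the localization step above might look roughly like this. It is a minimal sketch under stated assumptions: the `FirstAct` shape and the `localizeFirstAct` helper are hypothetical, and the download and webp conversion are left out. Only the `imageUrl` field and the `/home/firstscene/<name>.webp` path are taken from the commit message.

```typescript
// Hypothetical shape: only `imageUrl` is named in the commit message.
interface FirstAct {
  imageUrl: string;
  [key: string]: unknown;
}

// Returns a copy of the prebaked first act whose imageUrl points at the
// locally served webp instead of the remote hosted URL.
function localizeFirstAct(act: FirstAct, cardName: string): FirstAct {
  return { ...act, imageUrl: `/home/firstscene/${cardName}.webp` };
}

// Assumed example input; the remote URL is a placeholder.
const remoteAct: FirstAct = { imageUrl: "https://example.invalid/abc.jpg", title: "demo" };
const localAct = localizeFirstAct(remoteAct, "m01");
console.log(localAct.imageUrl); // /home/firstscene/m01.webp
```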

Scripts:
- scripts/generate-home-images.mjs  — rewrites all 64 cover prompts, per-card
  styles baked into prompts, 832×1024 dims to match StoryCard aspect
- scripts/prebake-firstacts.mjs     — POST /api/start × 64 with concurrency
  4, saves StartResponse to public/home/firstact/<name>.json
- scripts/localize-firstact-images.mjs — downloads each prebaked imageUrl
  to public/home/firstscene/<name>.webp (q80, ≤1600px) and rewrites JSON

README: adds Screenshots section (3×3 gallery) to README.md / README.zh-CN.md,
9 in-game shots compressed to docs/screenshots/*.webp (7.5MB → 680KB).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-02 17:20:34 +08:00
Zonghao Yuan cffe4da4ca docs: streamline 3 READMEs and fix EN language switcher (#6)
Slim overview across EN/zh/JA, drop badges/blockquote/contributing, trim LICENSE header; fix the English switcher to point at the repo homepage instead of the GitHub site root.
2026-06-02 15:33:08 +08:00
Zonghao Yuan 9a3511f220 docs: add AGPL-3.0 license, README i18n, and TTS accuracy fix (#2)
* docs: add AGPL-3.0 license, README i18n, and TTS accuracy fix

- LICENSE: add GNU AGPL v3 with InfiPlot copyright notice
- README.md: rewrite for open-source project, fix TTS description
  (TTS uses MiMo's own protocol, not OpenAI-compatible)
- README.zh-CN.md: add Simplified Chinese translation
- README.ja.md: add Japanese translation
- package.json: change license from UNLICENSED to AGPL-3.0-only

* fix: address Copilot review — .env.example TTS comment, zh-CN formatting

- .env.example: clarify TTS uses MiMo's own protocol, not OpenAI-compatible
- README.md: 'land paper after paper' → 'publish paper after paper'
- README.zh-CN.md: add spaces around '5 月', fix code formatting
  for model names (deepseek-v4-flash)
2026-06-02 13:39:54 +08:00
yuanzonghao 8eda27f241 chore: complete @yume → @infiplot rename (post-PR#9)
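PR #9 already migrated the visual branding of the homepage and layout; this
commit completes the remaining technical renames: workspace package names,
source imports, localStorage keys, the CSS keyframe, the internal header logo,
.env.example, and the README.

- @yume/* → @infiplot/* (6 package.json + 17 imports + lockfile)
- localStorage/sessionStorage: yume:* → infiplot:*
  (including yume:hintClosed, added in PR #9)
- CSS keyframe yume-ripple → infiplot-ripple
- Header logo on the new/play pages: "云梦" → "InfiPlot"
- Removed the "云梦"-style adjectives from code comments (layout.tsx, page.tsx)
- Root package.json name + description (description aligned with staging's
  "AI 实时交互剧情游戏", i.e. "AI real-time interactive story game")
- README: tagline / Vercel deploy URL / directory tree / engine description

Kept: the LLM genre terms 「视觉小说/galgame」 ("visual novel/galgame") in
prompts.ts, and 「视觉小说画风」 ("visual-novel art style") in the CustomForm
placeholder (a style noun that the image model recognizes).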

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-02 09:27:00 +08:00
yuanzonghao addbede929 feat: Vercel Hobby deploy readiness — image URLs, jsonrepair, DeepSeek
- Move vercel.json to apps/web/ with correct route paths; cap scene route
  maxDuration 120→60s for Hobby. Root vercel.json removed. Vercel project's
  Root Directory must be set to apps/web (Deploy button URL passes this).
- Switch image transport from base64-in-JSON to Runware-hosted URLs:
  generateImage now uses outputType=URL and returns {imageUrl, imageUuid};
  StartResponse/SceneResponse carry imageUrl; VisionRequest carries
  prevImageUrl (server re-fetches the bytes for click annotation). This
  eliminates the 4.5MB serverless body-size risk.
- Painter and director prefer URL over UUID for referenceImages — the UUID
  returned by Runware imageInference isn't always recognized in the refs
  pipeline (surfaces as `failedToTransferImage`).
- Client preloads scene images via `new Image().decode()` before committing
  to React state, so URL transitions render instantly; prefetched scenes
  also warm the HTTP cache.
- jsonParser uses the jsonrepair package (replaces hand-rolled repair) and
  adds a targeted preRepair regex for the missing-key-close-quote pattern
  that jsonrepair couldn't disambiguate. Full raw model output dumped on
  failure for diagnostic visibility.
- Default text provider switched to DeepSeek v4-flash via direct API
  (significantly more stable JSON than MiMo v2.5-pro). VISION/TTS stay on
  MiMo (DeepSeek has no multimodal / TTS offerings).
- next.config: drop dead experimental.serverActions.bodySizeLimit (no
  server actions used).
- README: real Deploy button URL (zonghaoyuan/yume + root-directory=apps/web
  + TTS/MOCK_IMAGE in env list); refreshed env vars table with optional
  TTS section.
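
The targeted pre-repair pass mentioned in the jsonParser bullet can be illustrated as follows. This is a sketch, not the project's actual regex: the pattern and both function names are assumptions, and plain `JSON.parse` stands in for the jsonrepair package so the example stays dependency-free. A string value that happens to contain `, "word:` would also match, so a real implementation would need to be more careful.

```typescript
// Hypothetical pre-repair pass: closes a key's missing quote, e.g.
//   {"name: "Alice"}  ->  {"name": "Alice"}
// It only fires when an identifier-like key is followed directly by a colon.
function preRepair(raw: string): string {
  return raw.replace(/([{,]\s*)"([A-Za-z_][A-Za-z0-9_]*)(\s*):/g, '$1"$2"$3:');
}

// Tries the raw text first and applies the pre-repair only on failure.
function parseModelJson(raw: string): unknown {
  try {
    return JSON.parse(raw);
  } catch {
    return JSON.parse(preRepair(raw));
  }
}

console.log(preRepair('{"name: "Alice", "age": 3}')); // {"name": "Alice", "age": 3}
```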

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-01 16:04:13 +08:00
Zonghao Yuan e261f4a346 feat: Runware FLUX.2 image + lazy per-beat TTS (#5)
Reduce median scene-load latency from ~30-80s to ~17-25s by switching image generation to Runware FLUX.2 [klein] 9B KV and moving per-beat TTS synthesis off the scene response into a new lazy /api/beat-audio endpoint with hard timeout + abort support.

- feat(image): migrate to Runware FLUX.2 [klein] 9B KV — task-array API, $0.001/image, sub-second inference.
- feat(tts): split /api/scene into directScene + image + voicedesign-provisioning; lazily synth per beat via /api/beat-audio with 15s hard timeout + AbortSignal threaded to MiMo so timed-out calls don't keep burning sockets/quota; client fans out per-beat fetches on scene-id change with abort + identity-check finally to prevent cross-scene beat-id collisions.
- refactor(tts): slim BeatAudioRequest to { beat, voice } — ~800KB per-beat upload dropped to ~160KB by sending only the speaker's voice instead of the full session.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2026-05-28 23:43:51 +08:00
Zonghao Yuan d1f13d51a3 feat: scene/beat architecture — decouple dialogue from image generation (#2)
Replace the one-image-per-interaction model with scenes that hold multiple
dialogue beats. The image regenerates only on scene-change actions; tapping
through beats and in-scene choices are instant and zero-network.

Squashed from #2:
- feat: scene/beat architecture — decouple dialogue from image generation
- fix: harden LLM-output parsing, prefetch lifecycle, and typewriter (PR review)
- fix: dedupe beat ids; fallback narration on empty insert-beat (PR review #2)
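
The scene/beat split and the beat-id dedupe fix listed above can be sketched as a data model plus one helper. The `Beat` and `Scene` field names and the `dedupeBeatIds` suffix scheme are assumptions for illustration; the commit message only states that a scene holds multiple beats, that the image belongs to the scene, and that duplicate beat ids are removed.

```typescript
// Hypothetical shapes: one image per scene, many dialogue beats inside it.
interface Beat {
  id: string;
  speaker?: string;
  text: string;
}

interface Scene {
  id: string;
  imageUrl: string;
  beats: Beat[];
}

// Makes beat ids unique within a scene by suffixing repeats (-2, -3, ...).
function dedupeBeatIds(beats: Beat[]): Beat[] {
  const seen = new Set<string>();
  return beats.map((beat) => {
    let id = beat.id;
    let n = 2;
    while (seen.has(id)) id = `${beat.id}-${n++}`;
    seen.add(id);
    return { ...beat, id };
  });
}

// Assumed example scene with a repeated id coming back from the model.
const demoScene: Scene = {
  id: "s1",
  imageUrl: "/home/firstscene/m01.webp",
  beats: dedupeBeatIds([
    { id: "b1", text: "..." },
    { id: "b1", text: "..." },
    { id: "b2", text: "..." },
  ]),
};
console.log(demoScene.beats.map((b) => b.id).join(",")); // b1,b1-2,b2
```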

🤖 Generated with [Claude Code](https://claude.com/claude-code)
2026-05-28 15:20:12 +08:00
yuanzonghao 2793c06278 refactor: rename project DADA → 云梦 (slug: yume)
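- All workspace packages @dada/* → @yume/*; root package dada → yume
- All import paths updated to match
- Internal IDs aligned: dada-ripple → yume-ripple, dada:custom → yume:custom
- User-facing copy on the homepage / new / play pages fully rewritten in Chinese, keeping the small-caps + serif + Roman-numeral typographic vocabulary
- README title changed to "# 云梦"; deploy link and directory-tree slug changed to yume
- Regenerated pnpm-lock.yaml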

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-24 10:14:14 +08:00
yuanzonghao d0f2868834 chore: drop MIT license and open-source framing
Project is now private; remove LICENSE file, README license
section, and "MIT · MMXXVI" footer tags. Root package.json
license set to UNLICENSED.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 13:18:07 +08:00
yuanzonghao 9cedfa66e4 feat: prefetch, vision split, provider adapter, UI polish
Engine
- Split /api/vision out from /api/interact so client can drive
  prefetch + cache lookup independently of click interpretation
- Image client switched to chat-completions+modalities API (OpenRouter/
  provider style), supporting markdown image URL responses
- annotateClick now resizes to 768w before composite to keep vision
  payloads small and avoid CDN timeouts
- Prompts updated to mention "JSON" in user messages (required by
  Gemini's strict JSON mode)
- Shared fetchWithRetry helper: 2 retries for chat/image, 0 for vision
  (with 60s hard timeout)
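
The per-provider retry policy in the last Engine bullet can be sketched as below. `withRetry`, `CallKind`, and the `RETRIES` table are hypothetical names, and the sketch wraps any async call rather than `fetch` specifically; only the retry counts (2 for chat/image, 0 for vision) come from the commit message. The 60s vision timeout is omitted here.

```typescript
type CallKind = "chat" | "image" | "vision";

// Retry counts as stated in the commit message.
const RETRIES: Record<CallKind, number> = { chat: 2, image: 2, vision: 0 };

// Runs `attempt` once, then up to RETRIES[kind] more times on failure,
// rethrowing the last error if every attempt fails.
async function withRetry<T>(kind: CallKind, attempt: () => Promise<T>): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i <= RETRIES[kind]; i++) {
    try {
      return await attempt();
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}
```

Under this policy a chat or image call is attempted at most three times in total, while a vision call fails on the first error.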

Client
- Parallel prefetch of all three choice branches on each new frame
- Effect deliberately excludes phase from deps so user-click doesn't
  abort in-flight prefetches
- Cache hit/miss/free-form fallback handled in handleClick
- PlayCanvas reads img naturalWidth/Height and adapts container to
  whatever aspect AI returns (no more cropped third choice)
- max-width raised to 560px, max-height calc(100dvh - 200px)

Misc
- README env-path corrected to apps/web/.env.local
- users.md: BGM/TTS idea note
- .env.example moved into apps/web alongside next config

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 19:38:03 +08:00
yuanzonghao cbd95bbea2 Initial commit: AI-driven visual novel scaffold
- Monorepo (pnpm workspace): apps/web + packages/{types,ai-client,engine}
- Next.js 16 web app with three-stage AI orchestration
- Three independently configurable providers: text LLM, image generator, vision model
- Warm minimalist editorial UI design
- One-click Vercel deploy ready

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 13:29:58 +08:00