feat: Vercel Hobby deploy readiness — image URLs, jsonrepair, DeepSeek

- Move vercel.json to apps/web/ with correct route paths; cap scene route
  maxDuration 120→60s for Hobby. Root vercel.json removed. Vercel project's
  Root Directory must be set to apps/web (Deploy button URL passes this).
- Switch image transport from base64-in-JSON to Runware-hosted URLs:
  generateImage now uses outputType=URL and returns {imageUrl, imageUuid};
  StartResponse/SceneResponse carry imageUrl; VisionRequest carries
  prevImageUrl (server re-fetches the bytes for click annotation). This
  eliminates the 4.5MB serverless body-size risk.
- Painter and director prefer URL over UUID for referenceImages — the UUID
  returned by Runware imageInference isn't always recognized in the refs
  pipeline (surfaces as `failedToTransferImage`).
- Client preloads scene images via `new Image().decode()` before committing
  to React state, so URL transitions render instantly; prefetched scenes
  also warm the HTTP cache.
- jsonParser uses the jsonrepair package (replaces hand-rolled repair) and
  adds a targeted preRepair regex for the missing-key-close-quote pattern
  that jsonrepair couldn't disambiguate. The full raw model output is dumped
  on failure for diagnostic visibility.
- Default text provider switched to DeepSeek v4-flash via direct API
  (significantly more stable JSON than MiMo v2.5-pro). VISION/TTS stay on
  MiMo (DeepSeek has no multimodal / TTS offerings).
- next.config: drop dead experimental.serverActions.bodySizeLimit (no
  server actions used).
- README: real Deploy button URL (zonghaoyuan/yume + root-directory=apps/web
  + TTS/MOCK_IMAGE in env list); refreshed env vars table with optional
  TTS section.
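
Illustration of the preRepair idea from the jsonParser bullet above. This is a minimal sketch, not the regex shipped in this commit: the names `preRepairMissingKeyQuote` and `parseLenient` and the pattern itself are assumptions. It targets object keys whose closing quote was dropped, e.g. `{"name: "Alice"}`, and is applied only after a plain `JSON.parse` has already failed.

```typescript
// Hypothetical pre-repair pass; the names and the regex are illustrative assumptions.
// Matches `{` or `,`, optional whitespace, an opening quote, an identifier-like
// key, and then a colon where the key's closing quote should have been.
const MISSING_KEY_CLOSE_QUOTE = /([{,]\s*)"([A-Za-z_][A-Za-z0-9_]*)\s*:/g;

function preRepairMissingKeyQuote(raw: string): string {
  // Re-emit the key with its closing quote restored: {"name: -> {"name":
  return raw.replace(MISSING_KEY_CLOSE_QUOTE, '$1"$2":');
}

function parseLenient(raw: string): unknown {
  try {
    // Well-formed input never reaches the heuristic.
    return JSON.parse(raw);
  } catch {
    // Only malformed input is rewritten; this still throws if the repair is not enough.
    return JSON.parse(preRepairMissingKeyQuote(raw));
  }
}

const repaired = parseLenient('{"name: "Alice", "age: 30}') as { name: string; age: number };
console.log(repaired.name, repaired.age);
```

Trying the strict parse first matters because the pattern would also rewrite valid strings such as the array element `"foo: bar"`. In a real pipeline the fallback branch would presumably hand the pre-repaired text to jsonrepair rather than straight to `JSON.parse`.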

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
yuanzonghao
2026-06-01 16:04:13 +08:00
parent a426b82275
commit addbede929
21 changed files with 392 additions and 325 deletions
+32 -18
@@ -56,17 +56,24 @@ export type Scene = {
    * e.g. "classroom-dusk", "rooftop-night". When the next Scene shares this
    * key, the Painter slots the previous Scene's image into Runware's
    * `referenceImages` (alongside character portraits) so the same physical
-   * space stays visually consistent across cuts. (Originally planned as a
-   * seedImage / img2img anchor, but FLUX.2 [klein] 9B KV does not support
-   * seedImage — referenceImages serves the same purpose with the model.)
+   * space stays visually consistent across cuts.
    */
   sceneKey?: string;
   /**
-   * Runware UUID of this Scene's generated image — once uploaded, subsequent
-   * Scenes that match sceneKey can reference it via `referenceImages`
-   * without resending base64.
+   * Runware UUID of this Scene's generated image. Cheapest form to send back
+   * to Runware's `referenceImages` in subsequent calls (UUID > URL > base64
+   * in transport cost). Not shown to the client — `imageUrl` is what renders.
    */
   imageUuid?: string;
+  /**
+   * Public CDN URL of this Scene's generated image. Returned to the client for
+   * `<img src>` rendering, and is what the client passes back to `/api/vision`
+   * as `prevImageUrl` so the server can re-fetch the bytes for click annotation.
+   *
+   * For MOCK_IMAGE=true this is a `data:image/png;base64,...` data URI, not a
+   * Runware URL — the client renders both forms transparently.
+   */
+  imageUrl?: string;
 };
export type SceneExit =
@@ -111,17 +118,17 @@ export type Character = {
    */
   visualDescription?: string;
   /**
-   * Base portrait image generated by the CharacterDesigner once, then reused
-   * as a Runware `referenceImages` entry in every subsequent scene the
-   * character appears in. Stored as base64 for client display.
-   */
-  basePortraitBase64?: string;
-  /**
-   * Runware UUID for the base portrait. Once uploaded via the image-upload
-   * endpoint, subsequent Painter calls reference this UUID instead of
-   * resending the full base64 payload.
+   * Runware UUID for the base portrait. Generated by the CharacterDesigner
+   * once, reused as a `referenceImages` entry on every subsequent scene the
+   * character appears in. UUID is the cheapest reference form for Runware.
    */
   basePortraitUuid?: string;
+  /**
+   * Public CDN URL for the base portrait. Same image as `basePortraitUuid`;
+   * kept around for the client (if it ever wants to render character cards)
+   * and as a fallback reference form for `referenceImages` when UUID is absent.
+   */
+  basePortraitUrl?: string;
   /** Xiaomi MiMo voice reference audio. */
   voice?: CharacterVoice;
 };
@@ -196,7 +203,8 @@ export type StartRequest = {
 export type StartResponse = {
   sessionId: string;
   scene: Scene;
-  imageBase64: string;
+  /** Public CDN URL (or data URI in MOCK_IMAGE mode) for the rendered scene background. */
+  imageUrl: string;
   /** Character registry with voice references + visual cards provisioned. */
   characters: Character[];
 };
@@ -210,7 +218,8 @@ export type SceneRequest = {
 export type SceneResponse = {
   scene: Scene;
-  imageBase64: string;
+  /** Public CDN URL (or data URI in MOCK_IMAGE mode) for the rendered scene background. */
+  imageUrl: string;
   characters: Character[];
 };
@@ -235,7 +244,12 @@ export type BeatAudioResponse = {
 // trigger a scene change.
 export type VisionRequest = {
   session: Session;
-  prevImageBase64: string;
+  /**
+   * Public CDN URL (or data URI in MOCK_IMAGE mode) of the scene the player
+   * just clicked. The server re-fetches the bytes to annotate the click and
+   * pass an OpenAI-compatible image_url to the vision LLM.
+   */
+  prevImageUrl: string;
   click: { x: number; y: number };
 };
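
The `prevImageUrl` field above may hold either an http(s) URL or, in MOCK_IMAGE mode, a `data:` URI, so server code that re-fetches the bytes has to handle both forms. A minimal sketch of that branch, assuming a runtime with global `fetch` and `atob` (for example Node 18+); the helper names `decodeDataUri` and `loadImageBytes` are hypothetical and not taken from this commit:

```typescript
type ImageBytes = { mimeType: string; bytes: Uint8Array };

// Decode a base64 `data:` URI such as "data:image/png;base64,AAAA".
// Returns null for anything that is not a base64 data URI.
function decodeDataUri(uri: string): ImageBytes | null {
  if (!uri.startsWith("data:")) return null;
  const comma = uri.indexOf(",");
  if (comma === -1) return null;
  const header = uri.slice("data:".length, comma); // e.g. "image/png;base64"
  const parts = header.split(";");
  if (parts[parts.length - 1] !== "base64") return null;
  const binary = atob(uri.slice(comma + 1));
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);
  return { mimeType: parts[0] || "text/plain", bytes };
}

// Resolve either form of prevImageUrl to raw bytes plus a MIME type.
async function loadImageBytes(prevImageUrl: string): Promise<ImageBytes> {
  const inline = decodeDataUri(prevImageUrl);
  if (inline) return inline; // MOCK_IMAGE path: no network round trip
  const res = await fetch(prevImageUrl);
  if (!res.ok) throw new Error(`image fetch failed with HTTP ${res.status}`);
  const mimeType = res.headers.get("content-type") ?? "application/octet-stream";
  return { mimeType, bytes: new Uint8Array(await res.arrayBuffer()) };
}
```

Because the URL arrives from the client, a production handler would likely also restrict which hosts it is willing to fetch from before calling `fetch`; that check is omitted here.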