refactor(engine): move click annotation from sharp to browser Canvas

The vision pipeline used sharp to draw a click marker on the scene image
server-side (engine/src/annotate.ts) and to render the MOCK_IMAGE
placeholder PNG (engine/src/mockImage.ts). Neither needs sharp at
runtime anymore:

- annotateClick → apps/web/lib/annotateClient.ts (Canvas 2D in the
  browser; toDataURL → raw PNG base64 forwarded to /api/vision). Saves
  a server-side image re-fetch per click and frees the engine from
  sharp's native binding (which doesn't run on Cloudflare Workers).
- mockImageDataUri → self-describing SVG data URI (no rendering needed).
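
The two replacements above can be sketched roughly as follows. This is a
minimal illustration, not the code in annotateClient.ts or mockImage.ts:
the helper name dataUrlToRawBase64, the label/width/height parameters,
and the SVG layout are all assumptions made for the example.

```typescript
// Hypothetical helper: canvas.toDataURL() returns "data:image/png;base64,<data>"
// by default; the API wants only the raw base64 payload.
function dataUrlToRawBase64(dataUrl: string): string {
  const prefix = "data:image/png;base64,";
  if (!dataUrl.startsWith(prefix)) {
    throw new Error("expected a base64 PNG data URL");
  }
  return dataUrl.slice(prefix.length);
}

// Assumed signature: builds a placeholder image as an SVG data URI.
// No rasterization is needed because the browser renders the SVG itself.
function mockImageDataUri(label: string, width = 512, height = 512): string {
  // Escape characters that are special in XML text content.
  const escaped = label
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
  const svg =
    `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}">` +
    `<rect width="100%" height="100%" fill="#222"/>` +
    `<text x="50%" y="50%" fill="#fff" text-anchor="middle">${escaped}</text>` +
    `</svg>`;
  return `data:image/svg+xml;charset=utf-8,${encodeURIComponent(svg)}`;
}
```

In the browser, the first helper would be applied to the string returned
by canvas.toDataURL() after the marker has been drawn.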

VisionRequest contract changes: prevImageUrl + click → annotatedImageBase64.
Server forwards the bytes straight to the vision LLM as image_url.
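
A rough sketch of the new contract and the forwarding step, under stated
assumptions: the session type is left opaque, isVisionRequest and
toImageUrlPart are hypothetical names, and the content-part shape assumes
the vision LLM accepts OpenAI-style chat messages with a data URL.

```typescript
// Assumed shape of the new request body; the real session type is not
// shown in this commit, so it is left as unknown here.
interface VisionRequest {
  session: unknown;
  annotatedImageBase64: string; // raw PNG base64, no "data:" prefix
}

// Hypothetical guard mirroring the route's "session and
// annotatedImageBase64 are required" check.
function isVisionRequest(body: unknown): body is VisionRequest {
  if (typeof body !== "object" || body === null) return false;
  const record = body as Record<string, unknown>;
  const image = record["annotatedImageBase64"];
  return Boolean(record["session"]) && typeof image === "string" && image.length > 0;
}

// Hypothetical helper: wraps the raw base64 back into a data URL so it
// can be sent as an image_url content part (OpenAI-style format assumed).
function toImageUrlPart(annotatedImageBase64: string) {
  return {
    type: "image_url" as const,
    image_url: { url: `data:image/png;base64,${annotatedImageBase64}` },
  };
}
```

A body in the old shape (prevImageUrl + click) fails the guard, which
matches the 400 response in the route.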

sharp is removed from packages/engine entirely and from next.config.ts's
serverExternalPackages. apps/web/package.json + lockfile cleanup ships
in the follow-up Cloudflare deployment commit.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
yuanzonghao
2026-06-02 21:46:45 +08:00
parent dd8b60c06b
commit 346d5359d4
10 changed files with 119 additions and 154 deletions
+2 -2
@@ -14,9 +14,9 @@ export async function POST(req: Request) {
     return NextResponse.json({ error: "Invalid JSON" }, { status: 400 });
   }
-  if (!body.session || !body.prevImageUrl || !body.click) {
+  if (!body.session || !body.annotatedImageBase64) {
     return NextResponse.json(
-      { error: "session, prevImageUrl, click are required" },
+      { error: "session and annotatedImageBase64 are required" },
       { status: 400 },
     );
   }
+3 -1
@@ -11,6 +11,7 @@ import {
   useState,
 } from "react";
 import { PlayCanvas, type Phase } from "@/components/PlayCanvas";
+import { annotateClick } from "@/lib/annotateClient";
 import { PRESETS } from "@/lib/presets";
 import type {
   Beat,
@@ -746,10 +747,11 @@ function PlayInner() {
     setPendingClick(click);
     try {
+      const annotatedImageBase64 = await annotateClick(imageUrl, click);
       const visionRes = await fetch("/api/vision", {
         method: "POST",
         headers: { "Content-Type": "application/json" },
-        body: JSON.stringify({ session, prevImageUrl: imageUrl, click }),
+        body: JSON.stringify({ session, annotatedImageBase64 }),
       });
       if (!visionRes.ok) {
         const j = (await visionRes.json().catch(() => ({}))) as {