infiplot-web/app/api/beat-audio
yuanzonghao e88e988de3 fix(web): reduce FOT by stripping redundant voice data from transport
Three transport-only optimizations that cut per-session Vercel Fast Origin Transfer (FOT) by ~50-60%:

P0 — Server strips voice.referenceAudioBase64 from already-known characters
in /api/scene and /api/insert-beat responses (defense-in-depth).
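
A minimal sketch of the P0 server-side strip, under stated assumptions: the `Character` and `Voice` shapes, the `voiceId` field, the `knownNames` set, and the function name `stripKnownVoiceAudio` are hypothetical. Only `voice.referenceAudioBase64` comes from the commit message.

```typescript
// Hypothetical types; only `voice.referenceAudioBase64` is named in the commit.
interface Voice {
  voiceId?: string; // assumed lightweight identifier, kept in the response
  referenceAudioBase64?: string; // large payload, dropped for known characters
}

interface Character {
  name: string;
  voice?: Voice;
}

// Returns a copy of `characters` in which every character the client already
// knows has its reference audio removed. Unknown (new) characters are returned
// unchanged so the client still receives their audio once.
function stripKnownVoiceAudio(
  characters: Character[],
  knownNames: ReadonlySet<string>,
): Character[] {
  return characters.map((c) => {
    if (!c.voice || !knownNames.has(c.name)) return c;
    const voice: Voice = { ...c.voice };
    delete voice.referenceAudioBase64;
    return { ...c, voice };
  });
}
```

The input objects are not mutated, so the server can keep using its full character records after building the response.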

P1 — Client strips all voice data from session before sending to
/api/scene, /api/vision, and /api/insert-beat. Voices are retained locally
and re-merged from responses via mergeCharactersPreserveVoice(). The engine
only needs character names + visualDescriptions for scene generation.
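
The P1 round trip could look roughly like the following. This is a sketch, not the repository's code: the `Session` and `Character` shapes and `stripVoicesForTransport` are hypothetical, and the body of `mergeCharactersPreserveVoice` is a guess at its behavior (the commit names the function but does not show its signature).

```typescript
// Hypothetical shapes for illustration.
interface Voice {
  referenceAudioBase64?: string;
}

interface Character {
  name: string;
  visualDescription: string;
  voice?: Voice;
}

interface Session {
  characters: Character[];
}

// Build the payload sent to the API: same session, with every `voice` omitted.
function stripVoicesForTransport(session: Session): Session {
  return {
    ...session,
    characters: session.characters.map(({ voice: _voice, ...rest }) => rest),
  };
}

// Assumed behavior: take the characters from the response, but restore the
// locally retained voice for any character the client already had.
function mergeCharactersPreserveVoice(
  local: Character[],
  incoming: Character[],
): Character[] {
  const localVoices = new Map(local.map((c) => [c.name, c.voice] as const));
  return incoming.map((c) => {
    const voice = localVoices.get(c.name) ?? c.voice;
    return voice ? { ...c, voice } : c;
  });
}
```

Matching characters by `name` is an assumption; a stable ID would be safer if characters can be renamed.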

P3 — /api/beat-audio returns binary audio (Response with Content-Type)
instead of JSON-wrapped base64, avoiding the ~33% size overhead of base64
encoding. The client converts responses to blob URLs; PlayCanvas accepts a
single audioSrc prop.
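
To illustrate P3, here is a hedged sketch using only the standard `Response`, `Blob`, and `URL.createObjectURL` APIs. The function names and the default `audio/mpeg` content type are assumptions, not taken from the repository. `base64Length` shows where the ~33% figure comes from: base64 emits 4 characters for every 3 input bytes.

```typescript
// Size of the base64 text (with padding) for a given number of raw bytes.
function base64Length(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3);
}

// Server side: return the raw audio bytes with an explicit Content-Type
// instead of wrapping a base64 string in JSON.
// "audio/mpeg" is an assumed default; use whatever the audio source produces.
function audioResponse(audio: ArrayBuffer, contentType = "audio/mpeg"): Response {
  return new Response(audio, {
    status: 200,
    headers: { "Content-Type": contentType },
  });
}

// Client side: turn the binary response into a blob URL that can be passed
// as a single audio source string.
async function toBlobUrl(res: Response): Promise<string> {
  if (!res.ok) throw new Error(`beat-audio request failed: ${res.status}`);
  const blob = await res.blob();
  return URL.createObjectURL(blob);
}
```

For example, 3000 bytes of audio become 4000 base64 characters, a third more than the binary body. Blob URLs hold the data in memory until `URL.revokeObjectURL` is called, so the client should release them once a clip is no longer needed.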

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-05 00:24:34 +08:00