infiplot-web/lib/ai-client/chat.ts
Zonghao Yuan 0dea2f8e36 fix(ai-client): clean up regressions from OpenAI SDK migration and canvas frame fix (#74)
Three follow-ups to ef3b579 (OpenAI SDK migration) and ebe39ef (canvas frame):

- .env.example / config.ts / AGENTS.md: anthropic & google native protocols
  were removed with the Vercel AI SDK, but .env.example and AGENTS.md still
  advertised them. Rewrite the docs to point Claude/Gemini at their
  OpenAI-compatible endpoints (api.anthropic.com/v1,
  generativelanguage.googleapis.com/v1beta/openai), drop the dead Gemini
  "Nano Banana" image example, sync AGENTS.md (text/vision protocol list,
  image protocol list, the "OpenAI/Gemini via AI SDK" reference note), and
  append a short hint to the readProvider() error message that guides
  anthropic/google users to openai_compatible instead of a bare rejection.

- chat.ts: drop the unsafe `as { prompt_tokens_details?: ... }` cast; read
  cached_tokens straight off the SDK's CompletionUsage type. Add a comment
  noting the OpenAI usage object reports cache reads only (no cache-write
  count), so the create cost the old AI SDK path logged is unrecoverable.

- PlayCanvas.tsx: revert <img key={imageUrl}> to key={imageUrl.slice(-48)}.
  The gpt-image/mock paths emit multi-MB data URIs; using the full string as
  React's reconciliation key adds avoidable diff overhead during the frequent
  re-renders. Matches the existing <audio> element's key convention.

Validation: pnpm typecheck passes. (pnpm lint fails on a pre-existing Next 16
`next lint` CLI issue, identical on staging — unrelated to this change.)
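
The PlayCanvas.tsx key change described above can be illustrated with a minimal sketch. `shortKey` and `fakeDataUri` are hypothetical names used only for this example; the commit itself inlines `imageUrl.slice(-48)` in the JSX.

```typescript
// Hypothetical helper: derive a short React key from a potentially huge data URI.
// slice(-n) returns the last n characters, or the whole string if it is shorter.
function shortKey(url: string, tailLength: number = 48): string {
  return url.slice(-tailLength);
}

// Simulated multi-MB data URI (assumed shape; real payloads come from the image provider).
const fakeDataUri: string =
  "data:image/png;base64," + "A".repeat(2000000) + "uniqueTail123";

const key: string = shortKey(fakeDataUri);
console.log(key.length); // 48
```

One trade-off to keep in mind: two different images whose URLs share the same last 48 characters would get the same key, so React would reuse the existing element and only update its attributes. For base64 payloads that is unlikely, but it is not impossible.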
2026-06-14 13:36:19 +08:00


import OpenAI from "openai";
import type { ProviderConfig } from "@infiplot/types";
import { normalizeBaseUrl } from "./normalizeUrl";

export type ChatMessage = {
  role: "system" | "user" | "assistant";
  content: string;
};

// Cache observability for the prompt-prefix caching that the Writer stable
// prefix relies on. The OpenAI usage object reports only cache READS
// (prompt_tokens_details.cached_tokens) and has no field for cache WRITES
// (tokens written to the cache on a cold pass), so unlike the old AI SDK
// path we can show the hit rate but not the create cost. cached_tokens lives
// directly on the SDK's CompletionUsage type, so no cast is needed.
function summarizeSdkUsage(
  tag: string,
  usage: OpenAI.Completions.CompletionUsage | undefined,
): string {
  if (!usage) return `[cache] ${tag} no-usage`;
  const input = usage.prompt_tokens ?? 0;
  const output = usage.completion_tokens ?? 0;
  const cached = usage.prompt_tokens_details?.cached_tokens;
  if (typeof cached === "number") {
    // Attach the percent sign to the number only, so a zero-token prompt
    // logs "rate=n/a" rather than "rate=n/a%".
    const rate = input > 0 ? `${((cached / input) * 100).toFixed(1)}%` : "n/a";
    return `[cache] ${tag} hit=${cached} input=${input} rate=${rate} completion=${output}`;
  }
  return `[cache] ${tag} input=${input} completion=${output} (provider didn't report cache stats)`;
}

export async function chat(
  config: ProviderConfig,
  messages: ChatMessage[],
  opts?: {
    temperature?: number;
    tag?: string;
  },
): Promise<string> {
  const client = new OpenAI({
    apiKey: config.apiKey,
    baseURL: normalizeBaseUrl(config.baseUrl, "openai_compatible"),
    maxRetries: 0,
    dangerouslyAllowBrowser: true,
  });

  const completion = await client.chat.completions.create({
    model: config.model,
    // ChatMessage.role is already the exact union, so no cast is required;
    // the map only strips any extra properties before sending.
    messages: messages.map((m) => ({
      role: m.role,
      content: m.content,
    })),
    temperature: opts?.temperature ?? 0.9,
    stream: false,
  });

  const text = completion.choices[0]?.message?.content ?? "";
  // Some providers send usage as null; normalize it to undefined for the summarizer.
  console.log(summarizeSdkUsage(opts?.tag ?? "chat", completion.usage ?? undefined));

  if (text.length === 0) {
    throw new Error("Chat API returned no content.");
  }
  return text;
}
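
The cache hit-rate arithmetic used for the log line can be tried in isolation with a minimal sketch. `UsageLike` and `cacheHitRate` are hypothetical names for this example; `UsageLike` only mirrors the usage fields that chat.ts reads and is an assumption, not the SDK's own type.

```typescript
// Assumed local shape covering only the fields read from the usage object.
type UsageLike = {
  prompt_tokens?: number;
  completion_tokens?: number;
  prompt_tokens_details?: { cached_tokens?: number };
};

// Returns the cache hit rate as a percentage rounded to one decimal place,
// or null when the provider sent no cache stats or the prompt had no tokens.
function cacheHitRate(usage: UsageLike | undefined): number | null {
  const input = usage?.prompt_tokens ?? 0;
  const cached = usage?.prompt_tokens_details?.cached_tokens;
  if (typeof cached !== "number" || input <= 0) return null;
  return Math.round((cached / input) * 1000) / 10;
}

// 900 of 1200 prompt tokens were served from the cache.
const warmPass: UsageLike = {
  prompt_tokens: 1200,
  completion_tokens: 300,
  prompt_tokens_details: { cached_tokens: 900 },
};
console.log(cacheHitRate(warmPass)); // 75

// A provider that omits prompt_tokens_details yields null rather than 0.
console.log(cacheHitRate({ prompt_tokens: 1200 })); // null
```

Returning null instead of 0 keeps "no cache stats reported" distinguishable from "cache reported, zero hits", which is the same distinction the log line draws between its two output formats.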