6 Commits

Author SHA1 Message Date
baizhi958216 ef3b57953b refactor(ai-client): replace AI SDK adapters with OpenAI SDK 2026-06-11 16:11:44 +08:00
yuanzonghao f4aca0b59c refactor(ai-client): extract shared createLanguageModel helper
De-duplicate the provider switch logic that was identical in chat.ts
and vision.ts into a shared model.ts module.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-07 11:55:55 +08:00
yuanzonghao 57bc6556ab refactor(ai-client): unify OpenAI-compatible path to AI SDK generateText
Eliminate the dual code path (raw fetch vs AI SDK) for text and vision.
All providers now go through createLanguageModel() + generateText(),
removing chatOpenAiCompatible/analyzeOpenAiCompatible, the manual Usage
type, summarizeUsage, and responseFormat plumbing from 8 call sites.

Key fix: @ai-sdk/openai v3 defaults to the Responses API (/responses);
DeepSeek only supports Chat Completions, so we use .chat() explicitly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-07 00:31:36 +08:00
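
Note on 57bc6556ab: the single code path described above can be pictured as one provider switch that every call site goes through. The sketch below is a dependency-free stand-in, not the repository's createLanguageModel() and not the AI SDK itself; `ModelHandle`, `createLanguageModelSketch`, and the endpoint strings are illustrative assumptions. It only shows why the OpenAI-compatible branch has to pick Chat Completions explicitly instead of the Responses endpoint.

```typescript
// Hypothetical stand-in for the shared helper; names and shapes are assumptions.
type ProviderProtocol = "openai_compatible" | "anthropic" | "google";

interface ModelHandle {
  protocol: ProviderProtocol;
  modelId: string;
  endpoint: string; // path appended to the normalized base URL
}

function createLanguageModelSketch(
  protocol: ProviderProtocol,
  modelId: string,
): ModelHandle {
  switch (protocol) {
    case "openai_compatible":
      // Chat Completions is chosen on purpose: per the commit message,
      // DeepSeek does not serve /responses.
      return { protocol, modelId, endpoint: "/chat/completions" };
    case "anthropic":
      return { protocol, modelId, endpoint: "/messages" };
    case "google":
      return { protocol, modelId, endpoint: `/models/${modelId}:generateContent` };
  }
}

// Text and vision callers would share this one switch.
const handle = createLanguageModelSketch("openai_compatible", "deepseek-chat");
console.log(handle.endpoint);
```

In the real code the `.chat()` call on the OpenAI provider plays the role of the first branch; the other branches would return the native Anthropic and Google models.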
yuanzonghao 83fd5717e7 feat(ai-client): multi-provider compat — native Anthropic/Google + URL tolerance
- TEXT/VISION: add native Anthropic & Google Gemini paths via Vercel AI SDK,
  selectable through TEXT_PROVIDER / VISION_PROVIDER (default openai_compatible)
- IMAGE: expand to openai (gpt-image) / google (Nano Banana) via AI SDK
  alongside the existing Runware task-array and OpenAI-compatible REST paths
- normalizeBaseUrl: tolerate URLs with/without /v1 (or /chat/completions);
  append the per-protocol version segment only for bare hosts
- config: readProvider() reads *_PROVIDER; types: ProviderProtocol + provider?
- deps: @ai-sdk/anthropic, @ai-sdk/google; docs in .env.example + README

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-04 17:09:05 +08:00
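
Note on 83fd5717e7: the URL tolerance rule (accept a base URL with or without /v1 or /chat/completions, and append a version segment only for a bare host) can be sketched roughly as follows. This is a minimal illustration under assumptions, not the repository's normalizeBaseUrl: the `VERSION_SEGMENT` table and its values are guesses made for the example, and the real function may treat more cases.

```typescript
// Hypothetical sketch; the actual implementation may differ.
type ProviderProtocol = "openai_compatible" | "anthropic" | "google";

// Assumed per-protocol version segments (illustrative values only).
const VERSION_SEGMENT: Record<ProviderProtocol, string> = {
  openai_compatible: "/v1",
  anthropic: "/v1",
  google: "/v1beta",
};

function normalizeBaseUrl(
  raw: string,
  protocol: ProviderProtocol = "openai_compatible",
): string {
  // Remove trailing slashes, then a pasted endpoint path if present.
  let url = raw.trim().replace(/\/+$/, "");
  url = url.replace(/\/chat\/completions$/, "");

  // A bare host is scheme://host[:port] with no path at all.
  const isBareHost = /^[a-z][a-z0-9+.-]*:\/\/[^/]+$/i.test(url);

  // Only bare hosts get the version segment; existing paths are kept as-is.
  return isBareHost ? url + VERSION_SEGMENT[protocol] : url;
}

console.log(normalizeBaseUrl("https://api.example.com"));
console.log(normalizeBaseUrl("https://api.example.com/v1/chat/completions"));
```

Under these assumptions, both calls yield the same base URL ending in /v1, so a user can paste either form into the environment variable.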
DESKTOP-I1T6TF3\Q 37c911f510 chore(engine): log prompt-cache hit/miss per chat call
Add a `tag` option to chat() and have it print one `[cache] <tag>
hit=X miss=Y rate=Z%` line per call. Three Usage-shape variants are
probed in order so the same logger works across providers:

  - DeepSeek (v3+):  usage.prompt_cache_hit_tokens / *_miss_tokens
  - OpenAI / o-series: usage.prompt_tokens_details.cached_tokens
  - Anthropic:        usage.cache_read_input_tokens / *_creation_*

When none of them are present (MiMo / local Ollama / others) we still
print prompt + completion totals so the cost baseline is visible.

Tag every callsite so the log is greppable:
  architect / writer / character-designer / cinematographer / insert-beat

This is the prerequisite for the prefix-cache reordering work that
follows — without per-agent visibility there's no way to tell if a
prompt rearrangement actually moved the needle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-03 10:42:33 +08:00
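
Note on 37c911f510: the ordered probing of the three usage shapes might look like the sketch below. The field names come from the commit message; everything else is an assumption for illustration, including the `UsageLike` type, the `formatCacheLine` name, the way a miss count is derived for the OpenAI and Anthropic shapes, and the exact fallback line format.

```typescript
// Hypothetical union of the usage fields named in the commit message.
interface UsageLike {
  prompt_tokens?: number;
  completion_tokens?: number;
  prompt_cache_hit_tokens?: number; // DeepSeek
  prompt_cache_miss_tokens?: number; // DeepSeek
  prompt_tokens_details?: { cached_tokens?: number }; // OpenAI
  input_tokens?: number; // Anthropic
  cache_read_input_tokens?: number; // Anthropic
  cache_creation_input_tokens?: number; // Anthropic
}

function formatCacheLine(tag: string, usage: UsageLike): string {
  let hit: number | undefined;
  let miss: number | undefined;
  const openAiCached = usage.prompt_tokens_details?.cached_tokens;

  if (usage.prompt_cache_hit_tokens !== undefined) {
    // DeepSeek reports hit and miss directly.
    hit = usage.prompt_cache_hit_tokens;
    miss = usage.prompt_cache_miss_tokens ?? 0;
  } else if (openAiCached !== undefined) {
    // Assumption: miss = total prompt tokens minus the cached part.
    hit = openAiCached;
    miss = (usage.prompt_tokens ?? openAiCached) - openAiCached;
  } else if (usage.cache_read_input_tokens !== undefined) {
    // Assumption: uncached input plus cache writes count as misses.
    hit = usage.cache_read_input_tokens;
    miss = (usage.input_tokens ?? 0) + (usage.cache_creation_input_tokens ?? 0);
  }

  if (hit === undefined || miss === undefined) {
    // No cache fields at all: still print the totals as a cost baseline.
    const prompt = usage.prompt_tokens ?? 0;
    const completion = usage.completion_tokens ?? 0;
    return `[cache] ${tag} prompt=${prompt} completion=${completion}`;
  }
  const total = hit + miss;
  const rate = total === 0 ? 0 : Math.round((hit / total) * 100);
  return `[cache] ${tag} hit=${hit} miss=${miss} rate=${rate}%`;
}

console.log(
  formatCacheLine("writer", { prompt_cache_hit_tokens: 75, prompt_cache_miss_tokens: 25 }),
);
```

Because every line starts with `[cache] <tag>`, filtering the log by agent stays a plain grep on the tag.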
Zonghao Yuan dc5ecd60f6 refactor: flatten monorepo to single web package (#12)
Flatten the pnpm monorepo (apps/web + packages/*) into a single web package at the repo root.

- Move app/lib/components/scripts/public to root; drop apps/web and packages/* wrappers
- Rewrite tsconfig paths (@infiplot/*) to ./lib/*; turbopack.root = __dirname
- Update Vercel (no root-directory) and Cloudflare (pnpm build:cf at root) deploy paths
- Regenerate pnpm-lock.yaml to drop stale workspace importers
- Bump engines.node to >=22 to match wrangler

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-03 00:55:45 +08:00