Initial commit: AI-driven visual novel scaffold

- Monorepo (pnpm workspace): apps/web + packages/{types,ai-client,engine}
- Next.js 16 web app with three-stage AI orchestration
- Three independently configurable providers: text LLM, image generator, vision model
- Warm minimalist editorial UI design
- One-click Vercel deploy ready

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
yuanzonghao
2026-05-09 13:29:58 +08:00
commit cbd95bbea2
45 changed files with 1855 additions and 0 deletions
+21
@@ -0,0 +1,21 @@
# =============================================================
# Dada — AI Visual Novel
# Three independently configurable AI providers
# Any OpenAI-compatible endpoint works (OpenAI, Anthropic, Gemini,
# OpenRouter, DeepSeek, Ollama, ...).
# =============================================================
# ---- 1. Text LLM (story director) -----------------------------
TEXT_BASE_URL=https://api.anthropic.com/v1
TEXT_API_KEY=sk-ant-xxx
TEXT_MODEL=claude-opus-4-7
# ---- 2. Image generator (renders the whole UI screen) ---------
IMAGE_BASE_URL=https://api.openai.com/v1
IMAGE_API_KEY=sk-xxx
IMAGE_MODEL=gpt-image-2
# ---- 3. Vision model (interprets where the user clicked) ------
VISION_BASE_URL=https://generativelanguage.googleapis.com/v1beta/openai
VISION_API_KEY=xxx
VISION_MODEL=gemini-3-flash
+21
@@ -0,0 +1,21 @@
node_modules
.pnpm-store
.next
dist
build
out
*.tsbuildinfo
.env
.env.local
.env.*.local
.vercel
.turbo
.DS_Store
*.log
npm-debug.log*
pnpm-debug.log*
repomix-output.xml
+21
@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2026 Dada contributors
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
+93
@@ -0,0 +1,93 @@
# Dada
> An AI-driven visual novel where every frame — scenes, dialogue, choices — is rendered by an AI, one frame at a time. You click. It paints. The story unfolds.
Open source, MIT.
---
## How it works
Each turn is three model calls:
```
[user clicks somewhere on the image]
1. Vision model interprets the click against the visible UI
2. Text LLM writes the next frame (narration, dialogue, choices)
3. Image model renders the entire next UI screen — scene, dialogue,
buttons, all of it — as one painted frame
[new image is shown; repeat]
```
There is no traditional UI. There is only the image. The AI chooses the layout, the colors, the typography, the buttons. Pick "stick figure on grid paper" as your style and you'll get hand-drawn UI. Pick "cyberpunk noir" and you'll get neon HUDs. Whatever fits the world.
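The three calls above can be sketched as a single function. This is a minimal illustration, not the actual `packages/engine` code: the `Stages`, `Frame`, `TurnInput`, and `runTurn` names are hypothetical, and each stage is injected as a stub so the sketch runs without any network access.

```typescript
// Hypothetical shapes for illustration; the real types live in packages/types and may differ.
type Click = { x: number; y: number }; // normalized to 0..1 within the image
type Intent = { targetLabel: string };
type Frame = { narration: string; choices: string[] };

// One function per model call, injected so they can be swapped or stubbed.
type Stages = {
  readClick: (prevImageBase64: string, click: Click) => Promise<Intent>;
  writeFrame: (history: Frame[], intent: Intent) => Promise<Frame>;
  paintFrame: (frame: Frame, styleGuide: string) => Promise<string>;
};

type TurnInput = {
  history: Frame[];
  styleGuide: string;
  prevImageBase64: string;
  click: Click;
};
type TurnOutput = { intent: Intent; frame: Frame; imageBase64: string };

// Runs the stages strictly in order: vision, then text, then image.
async function runTurn(stages: Stages, input: TurnInput): Promise<TurnOutput> {
  const intent = await stages.readClick(input.prevImageBase64, input.click);
  const frame = await stages.writeFrame(input.history, intent);
  const imageBase64 = await stages.paintFrame(frame, input.styleGuide);
  return { intent, frame, imageBase64 };
}

// Stub stages that return canned values instead of calling any model.
const stubStages: Stages = {
  readClick: async (_image, click) => ({
    targetLabel: `click at ${click.x.toFixed(2)}, ${click.y.toFixed(2)}`,
  }),
  writeFrame: async (history, intent) => ({
    narration: `Turn ${history.length + 1}: you chose "${intent.targetLabel}".`,
    choices: ["Continue"],
  }),
  paintFrame: async (frame, styleGuide) =>
    `stub-image:${styleGuide}:${frame.choices.length}`,
};
```

Because each stage waits for the previous one, the latency of a turn is roughly the sum of the three calls.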
---
## One-click deploy
[![Deploy with Vercel](https://vercel.com/button)](https://vercel.com/new/clone?repository-url=https://github.com/YOUR_USERNAME/dada&env=TEXT_BASE_URL,TEXT_API_KEY,TEXT_MODEL,IMAGE_BASE_URL,IMAGE_API_KEY,IMAGE_MODEL,VISION_BASE_URL,VISION_API_KEY,VISION_MODEL&envDescription=Three%20independently%20configurable%20providers.%20Any%20OpenAI-compatible%20endpoint%20works.&envLink=https://github.com/YOUR_USERNAME/dada%23environment-variables)
After deploy, set the nine environment variables (see below) in your Vercel project. That's it.
---
## Environment variables
Three providers, all independently configurable. Any OpenAI-compatible chat / image endpoint works (OpenAI, Anthropic via OpenAI-compat proxy, Gemini, OpenRouter, DeepSeek, local Ollama, …).
| Provider | Variables | Recommended |
|---|---|---|
| Text · story director | `TEXT_BASE_URL` `TEXT_API_KEY` `TEXT_MODEL` | `claude-opus-4-7` via Anthropic |
| Image · UI renderer | `IMAGE_BASE_URL` `IMAGE_API_KEY` `IMAGE_MODEL` | `gpt-image-2` via OpenAI |
| Vision · click reader | `VISION_BASE_URL` `VISION_API_KEY` `VISION_MODEL` | `gemini-3-flash` via Google |
See `.env.example` for the exact shape.
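To make "OpenAI-compatible" concrete, the sketch below shows how one base URL / key / model triple could be turned into a chat request. It assumes the conventional `/chat/completions` path and `Authorization: Bearer` header used by OpenAI-style APIs; `buildChatRequest` is a hypothetical helper, not a function from `packages/ai-client`, and an image provider would use a different path.

```typescript
// Mirrors the three variables each provider needs (base URL, key, model).
type ProviderConfig = { baseUrl: string; apiKey: string; model: string };
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

type BuiltRequest = {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
};

// Hypothetical helper: builds the URL and fetch options without sending anything.
function buildChatRequest(cfg: ProviderConfig, messages: ChatMessage[]): BuiltRequest {
  // Tolerate a trailing slash in the configured base URL.
  const base = cfg.baseUrl.replace(/\/+$/, "");
  return {
    url: `${base}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${cfg.apiKey}`,
      },
      body: JSON.stringify({ model: cfg.model, messages }),
    },
  };
}
```

The result can be passed to `fetch(request.url, request.init)`. Swapping providers then only means changing the three variables, which is the point of keeping them independent.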
---
## Local development
Requires Node 20+ and pnpm 9+.
```bash
pnpm install
cp .env.example .env.local
# fill in the nine env vars
pnpm dev
# open http://localhost:3000
```
---
## Project layout
```
dada/
├── apps/web/ Next.js 16 app — pages + API routes
└── packages/
    ├── types/ shared TypeScript types
    ├── ai-client/ unified OpenAI-compatible clients
    └── engine/ three-stage AI orchestration (open core)
```
`packages/engine` is the open core — pure TS, no Next.js or browser dependency. Import it directly to build your own visual-novel front-end (Tauri, Electron, CLI, anywhere).
---
## Cost & limits
Each turn costs roughly **\$0.15-0.25** in API fees with the recommended model trio. A 30-turn session is **\~\$5-8**. There is no rate limiting or auth out of the box; if you make your deployment public, your bill will reflect that. Add limits before sharing widely.
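The arithmetic, and one possible shape for a limit, can be sketched as follows. Both `estimateSessionCost` and `FixedWindowLimiter` are hypothetical and not part of this repo. The per-turn defaults of 0.15 and 0.25 dollars are an assumption taken from the estimate above, and the limiter is a simple in-memory fixed window keyed by whatever identifies a client (an IP address, for example).

```typescript
// Rough session cost from a per-turn range (defaults are assumed from the estimate above).
function estimateSessionCost(
  turns: number,
  perTurnLow = 0.15,
  perTurnHigh = 0.25,
): { low: number; high: number } {
  return { low: turns * perTurnLow, high: turns * perTurnHigh };
}

// Hypothetical fixed-window limiter: at most `limit` turns per `windowMs` per key.
class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the call is allowed and records it; false if the key is over its limit.
  allow(key: string, now: number = Date.now()): boolean {
    const current = this.windows.get(key);
    if (!current || now - current.start >= this.windowMs) {
      // No window yet, or the old one has expired: start a fresh window.
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}
```

An API route could call `allow(clientKey)` before running a turn and return HTTP 429 when it is false. Note that an in-memory map is per process, so on a serverless platform each instance keeps its own counts; a shared store would be needed for a strict limit.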
---
## License
MIT.
+32
@@ -0,0 +1,32 @@
import { takeTurn } from "@dada/engine";
import type { InteractRequest } from "@dada/types";
import { NextResponse } from "next/server";
import { loadEngineConfig } from "@/lib/config";
export const runtime = "nodejs";
export const maxDuration = 60;
export async function POST(req: Request) {
let body: InteractRequest;
try {
body = (await req.json()) as InteractRequest;
} catch {
return NextResponse.json({ error: "Invalid JSON" }, { status: 400 });
}
if (!body.session || !body.prevImageBase64 || !body.click) {
return NextResponse.json(
{ error: "session, prevImageBase64, click are required" },
{ status: 400 },
);
}
try {
const config = loadEngineConfig();
const result = await takeTurn(config, body);
return NextResponse.json(result);
} catch (err) {
const message = err instanceof Error ? err.message : "Unknown error";
return NextResponse.json({ error: message }, { status: 500 });
}
}
+32
@@ -0,0 +1,32 @@
import { startSession } from "@dada/engine";
import type { StartRequest } from "@dada/types";
import { NextResponse } from "next/server";
import { loadEngineConfig } from "@/lib/config";
export const runtime = "nodejs";
export const maxDuration = 60;
export async function POST(req: Request) {
let body: StartRequest;
try {
body = (await req.json()) as StartRequest;
} catch {
return NextResponse.json({ error: "Invalid JSON" }, { status: 400 });
}
if (!body.worldSetting?.trim() || !body.styleGuide?.trim()) {
return NextResponse.json(
{ error: "worldSetting and styleGuide are required" },
{ status: 400 },
);
}
try {
const config = loadEngineConfig();
const result = await startSession(config, body);
return NextResponse.json(result);
} catch (err) {
const message = err instanceof Error ? err.message : "Unknown error";
return NextResponse.json({ error: message }, { status: 500 });
}
}
+68
@@ -0,0 +1,68 @@
@tailwind base;
@tailwind components;
@tailwind utilities;
@layer base {
html {
font-feature-settings: "ss01", "kern", "liga";
-webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
}
body {
background-image:
radial-gradient(rgba(133, 79, 37, 0.025) 1px, transparent 1px),
radial-gradient(rgba(133, 79, 37, 0.018) 1px, transparent 1px);
background-size: 28px 28px, 38px 38px;
background-position: 0 0, 14px 19px;
}
::selection {
background-color: rgb(217 122 46 / 0.28);
color: #2d1810;
}
textarea::placeholder {
color: rgb(168 105 59 / 0.45);
}
}
@layer utilities {
.hairline {
background-image: linear-gradient(
to right,
transparent,
rgba(45, 24, 16, 0.18) 18%,
rgba(45, 24, 16, 0.18) 82%,
transparent
);
height: 1px;
}
.hairline-full {
height: 1px;
background: rgba(45, 24, 16, 0.14);
}
.num {
font-variant-numeric: tabular-nums lining-nums;
}
.smallcaps {
text-transform: uppercase;
letter-spacing: 0.32em;
}
}
@keyframes dada-ripple {
0% {
width: 14px;
height: 14px;
opacity: 0.95;
}
100% {
width: 110px;
height: 110px;
opacity: 0;
}
}
+38
@@ -0,0 +1,38 @@
import type { Metadata } from "next";
import "./globals.css";
export const metadata: Metadata = {
title: "Dada — AI Visual Novel",
description:
"An open-source visual novel where every frame is generated by AI.",
};
export default function RootLayout({
children,
}: {
children: React.ReactNode;
}) {
return (
<html lang="zh-CN">
<head>
<link rel="preconnect" href="https://fonts.googleapis.com" />
<link
rel="preconnect"
href="https://fonts.gstatic.com"
crossOrigin=""
/>
<link
rel="stylesheet"
href="https://fonts.googleapis.com/css2?family=Cormorant+Garamond:ital,wght@0,300;0,400;0,500;0,600;1,300;1,400;1,500&family=Inter:wght@300;400;500&display=swap"
/>
<link
rel="stylesheet"
href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.5.1/css/all.min.css"
/>
</head>
<body className="bg-cream-50 text-clay-900 font-sans antialiased min-h-screen">
{children}
</body>
</html>
);
}
+58
@@ -0,0 +1,58 @@
import Link from "next/link";
import { CustomForm } from "@/components/CustomForm";
export default function NewPage() {
return (
<div className="min-h-screen flex flex-col">
<header className="px-6 md:px-16 pt-7 md:pt-10 flex items-center justify-between">
<Link
href="/"
className="text-[10px] smallcaps text-clay-700 hover:text-clay-900 transition-colors flex items-center gap-2"
>
<i className="fa-solid fa-arrow-left text-[9px]" />
Dada
</Link>
<span className="text-[10px] smallcaps text-clay-500">
Compose · a · world
</span>
</header>
<section className="px-6 md:px-16 pt-20 md:pt-32 pb-20 md:pb-24 flex-1">
<div className="grid grid-cols-12 gap-8 md:gap-16 max-w-6xl">
<div className="col-span-12 md:col-span-4 animate-fade-in">
<p className="text-[10px] smallcaps text-clay-500 mb-6">
· Untitled
</p>
<h1 className="font-serif text-[44px] md:text-[64px] text-clay-900 leading-[0.96] mb-8">
Write
<br />
<em className="italic text-clay-600">two</em>
<br />
paragraphs.
</h1>
<div className="hairline w-12 mb-6" />
<p className="font-serif text-base text-clay-700 leading-[1.7]">
The first sketches the world your story unfolds in. The second
describes how the world should look: its medium, its mood, its
grain.
</p>
<p className="font-serif italic text-sm text-clay-500 mt-5 leading-relaxed">
Both fields accept any language. Specificity rewards specificity.
</p>
</div>
<div className="col-span-12 md:col-span-7 md:col-start-6">
<CustomForm />
</div>
</div>
</section>
<footer className="px-6 md:px-16 pb-8">
<div className="hairline-full w-full mb-4" />
<div className="flex items-center justify-between text-[10px] smallcaps text-clay-500">
<span>MIT · MMXXVI</span>
<span className="num"> · </span>
</div>
</footer>
</div>
);
}
+159
@@ -0,0 +1,159 @@
import Link from "next/link";
import { PRESETS } from "@/lib/presets";
import { PresetCard } from "@/components/PresetCard";
const ORDINALS = ["Ⅰ", "Ⅱ", "Ⅲ", "Ⅳ"];
export default function LandingPage() {
return (
<div className="min-h-screen flex flex-col">
<header className="px-6 md:px-16 pt-7 md:pt-10 flex items-center justify-between">
<div className="flex items-center gap-4">
<span className="text-[10px] smallcaps text-clay-700 font-medium">
Dada
</span>
<span className="hairline w-10 hidden md:block" />
<span className="text-[10px] smallcaps text-clay-500 hidden md:block">
Frame · by · Frame
</span>
</div>
<nav className="flex items-center gap-5 text-[10px] smallcaps text-clay-600">
<a
href="https://github.com"
className="hover:text-clay-900 transition-colors"
>
GitHub
</a>
<span className="text-clay-300">·</span>
<a href="#about" className="hover:text-clay-900 transition-colors">
About
</a>
</nav>
</header>
<section className="px-6 md:px-16 pt-20 md:pt-36 pb-20 md:pb-28">
<div className="grid grid-cols-12 gap-8">
<div className="col-span-12 md:col-span-7 animate-fade-in">
<p className="text-[10px] smallcaps text-clay-500 mb-8">
An open-source experiment · MMXXVI
</p>
<h1 className="font-serif font-light text-[56px] md:text-[104px] leading-[0.94] text-clay-900 tracking-tight">
Every{" "}
<em className="italic font-light text-clay-600">frame</em>
<br />
is painted on
<br />
<span className="text-ember-500 italic font-light">demand.</span>
</h1>
<p className="mt-10 md:mt-14 max-w-md font-serif text-lg md:text-xl text-clay-700 leading-[1.65]">
Dada is a visual novel where the <em>entire</em> interface (scene,
dialogue, choices) is rendered by an AI, one frame at a time. You
click. It paints. The story unfolds.
</p>
</div>
<aside className="col-span-12 md:col-span-4 md:col-start-9 mt-8 md:mt-0 flex md:items-end">
<div className="space-y-3">
<div className="hairline w-12" />
<p className="font-serif italic text-clay-600 text-base md:text-[17px] leading-relaxed max-w-[280px]">
&ldquo;It is impossible to step into the same river twice.
</p>
<p className="font-serif italic text-clay-600 text-base md:text-[17px] leading-relaxed max-w-[280px]">
It is impossible to play the same Dada twice.&rdquo;
</p>
<p className="text-[10px] smallcaps text-clay-500 pt-2">
README · v0.1
</p>
</div>
</aside>
</div>
</section>
<div className="px-6 md:px-16">
<div className="hairline-full w-full" />
</div>
<section className="px-6 md:px-16 pt-14 md:pt-20 pb-16 md:pb-24">
<div className="flex items-baseline justify-between mb-8 md:mb-10">
<h2 className="text-[10px] smallcaps text-clay-700 font-medium">
Four Doors
</h2>
<p className="text-[10px] smallcaps text-clay-500 hidden md:block">
Choose a world · or compose your own
</p>
</div>
<div className="grid grid-cols-1">
{PRESETS.map((p, i) => (
<PresetCard key={p.id} preset={p} ordinal={ORDINALS[i]!} />
))}
<Link
href="/new"
className="group block w-full py-10 md:py-12 border-t border-b border-clay-900/10 hover:border-clay-900/35 transition-[border-color] duration-500"
>
<div className="flex items-baseline gap-6 md:gap-10">
<span className="font-serif italic text-2xl md:text-3xl text-clay-400 group-hover:text-clay-700 transition-colors duration-500 w-8 shrink-0">
{ORDINALS[3]}
</span>
<div className="flex-1 min-w-0">
<h3 className="font-serif text-3xl md:text-4xl text-clay-900 leading-tight mb-2.5">
Untitled
</h3>
<p className="text-sm text-clay-600 leading-relaxed max-w-md">
Bring your own world. Describe it in your own words.
</p>
</div>
<span className="hidden md:flex items-center gap-3 text-[10px] tracking-[0.4em] text-clay-400 group-hover:text-ember-500 transition-colors duration-500 shrink-0 self-center">
COMPOSE
<span className="w-7 h-px bg-current transition-all duration-500 group-hover:w-12" />
</span>
</div>
</Link>
</div>
</section>
<section
id="about"
className="px-6 md:px-16 pb-20 md:pb-28 grid grid-cols-12 gap-8"
>
<div className="col-span-12 md:col-span-3">
<p className="text-[10px] smallcaps text-clay-500 mb-3">
Colophon · I
</p>
<p className="font-serif italic text-clay-700 text-base leading-relaxed">
A small open-source experiment in generative narrative. Self-host on
Vercel in a single click.
</p>
</div>
<div className="col-span-12 md:col-span-3 md:col-start-5">
<p className="text-[10px] smallcaps text-clay-500 mb-3">
Colophon · II
</p>
<ul className="font-serif text-clay-700 text-base leading-relaxed space-y-1">
<li>Story · large language model</li>
<li>Image · generative renderer</li>
<li>Click · vision interpreter</li>
</ul>
</div>
<div className="col-span-12 md:col-span-3 md:col-start-9">
<p className="text-[10px] smallcaps text-clay-500 mb-3">
Colophon · III
</p>
<p className="font-serif italic text-clay-700 text-base leading-relaxed">
All three are configured separately; bring any OpenAI-compatible
endpoint.
</p>
</div>
</section>
<footer className="px-6 md:px-16 pb-10 mt-auto">
<div className="hairline-full w-full mb-5" />
<div className="flex items-center justify-between text-[10px] smallcaps text-clay-500">
<span>MIT · MMXXVI</span>
<span className="num"> · </span>
</div>
</footer>
</div>
);
}
+235
@@ -0,0 +1,235 @@
"use client";
import Link from "next/link";
import { useRouter, useSearchParams } from "next/navigation";
import { Suspense, useEffect, useRef, useState } from "react";
import { PlayCanvas, type Phase } from "@/components/PlayCanvas";
import { PRESETS } from "@/lib/presets";
import type {
ClickIntent,
InteractResponse,
Session,
StartResponse,
StoryFrame,
} from "@dada/types";
function PlayInner() {
const router = useRouter();
const params = useSearchParams();
const [phase, setPhase] = useState<Phase>("loading-first");
const [session, setSession] = useState<Session | null>(null);
const [imageBase64, setImageBase64] = useState<string | null>(null);
const [frame, setFrame] = useState<StoryFrame | null>(null);
const [intent, setIntent] = useState<ClickIntent | null>(null);
const [pendingClick, setPendingClick] = useState<{
x: number;
y: number;
} | null>(null);
const [turnNum, setTurnNum] = useState(0);
const [error, setError] = useState<string | null>(null);
const startedRef = useRef(false);
useEffect(() => {
if (startedRef.current) return;
startedRef.current = true;
let payload: { worldSetting: string; styleGuide: string } | null = null;
const presetId = params.get("preset");
if (presetId) {
const p = PRESETS.find((x) => x.id === presetId);
if (p) {
payload = { worldSetting: p.worldSetting, styleGuide: p.styleGuide };
}
} else if (params.get("custom") === "1") {
const stored = sessionStorage.getItem("dada:custom");
if (stored) {
try {
payload = JSON.parse(stored);
} catch {
payload = null;
}
}
}
if (!payload) {
router.replace("/");
return;
}
const finalPayload = payload;
fetch("/api/start", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify(finalPayload),
})
.then(async (r) => {
if (!r.ok) {
const j = (await r.json().catch(() => ({}))) as { error?: string };
throw new Error(j.error ?? r.statusText);
}
return r.json() as Promise<StartResponse>;
})
.then((data) => {
setSession({
id: data.sessionId,
createdAt: Date.now(),
worldSetting: finalPayload.worldSetting,
styleGuide: finalPayload.styleGuide,
history: [{ frame: data.frame }],
});
setFrame(data.frame);
setImageBase64(data.imageBase64);
setPhase("ready");
setTurnNum(1);
})
.catch((e) => setError(String(e)));
}, [params, router]);
async function handleClick(click: { x: number; y: number }) {
if (phase !== "ready" || !session || !imageBase64) return;
setPhase("interacting");
setPendingClick(click);
setIntent(null);
try {
const res = await fetch("/api/interact", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({
session,
prevImageBase64: imageBase64,
click,
}),
});
if (!res.ok) {
const j = (await res.json().catch(() => ({}))) as { error?: string };
throw new Error(j.error ?? res.statusText);
}
const data = (await res.json()) as InteractResponse;
const updatedHistory = [
...data.session.history,
{ frame: data.frame },
];
setSession({ ...data.session, history: updatedHistory });
setFrame(data.frame);
setImageBase64(data.imageBase64);
setIntent(data.intent);
setPendingClick(null);
setTurnNum((t) => t + 1);
setPhase("ready");
} catch (e) {
setError(String(e));
setPendingClick(null);
setPhase("ready");
}
}
if (error) {
return (
<div className="min-h-screen flex flex-col items-center justify-center px-8">
<div className="max-w-md text-center animate-fade-in">
<p className="text-[10px] smallcaps text-clay-500 mb-6">
An · error · occurred
</p>
<p className="font-serif italic text-clay-900 text-lg leading-[1.7] mb-10">
{error}
</p>
<Link
href="/"
className="text-[10px] smallcaps text-clay-700 hover:text-ember-500 transition-colors inline-flex items-center gap-3"
>
<i className="fa-solid fa-arrow-left text-[9px]" />
Return
</Link>
</div>
</div>
);
}
return (
<div className="min-h-screen flex flex-col">
<header className="px-5 md:px-12 pt-6 md:pt-8 flex items-center justify-between">
<Link
href="/"
className="text-[10px] smallcaps text-clay-600 hover:text-clay-900 transition-colors flex items-center gap-2"
>
<i className="fa-solid fa-arrow-left text-[9px]" />
Dada
</Link>
<div className="flex items-center gap-3 text-[10px] smallcaps text-clay-500 num">
<span>Frame · {String(turnNum).padStart(3, "0")}</span>
<span className="text-clay-300">·</span>
<span className="hidden sm:inline truncate max-w-[180px]">
{session?.id.slice(2, 14) ?? "—"}
</span>
</div>
</header>
<main className="flex-1 flex flex-col items-center justify-center px-4 md:px-8 py-6 md:py-10">
<PlayCanvas
imageBase64={imageBase64}
phase={phase}
pendingClick={pendingClick}
onClick={handleClick}
/>
<div className="mt-7 md:mt-9 max-w-md w-full text-center min-h-[64px] flex items-center justify-center">
{phase === "loading-first" && (
<p className="text-[10px] smallcaps text-clay-500 animate-slow-pulse">
Summoning · the · first · frame
</p>
)}
{phase === "interacting" && (
<div className="flex flex-col items-center gap-2 animate-fade-in">
<p className="text-[10px] smallcaps text-clay-500 animate-slow-pulse">
AI · is · painting · the · next · moment
</p>
<p className="font-serif italic text-clay-400 text-xs">
this usually takes 12-20 seconds
</p>
</div>
)}
{phase === "ready" && intent?.targetLabel && (
<p className="font-serif italic text-clay-500 text-base leading-relaxed animate-fade-in max-w-[320px]">
<span className="text-[9px] smallcaps not-italic text-clay-400 mr-2 align-middle">
Last · move ·
</span>
<span className="align-middle">{intent.targetLabel}</span>
</p>
)}
{phase === "ready" && !intent && turnNum > 0 && (
<p className="text-[10px] smallcaps text-clay-400 animate-fade-in">
Click · anywhere · to · respond
</p>
)}
</div>
</main>
<footer className="px-5 md:px-12 pb-6">
<div className="text-[9px] smallcaps text-clay-400 text-center num">
·
</div>
</footer>
</div>
);
}
export default function PlayPage() {
return (
<Suspense
fallback={
<div className="min-h-screen flex items-center justify-center">
<span className="text-[10px] smallcaps text-clay-500 animate-slow-pulse">
Loading
</span>
</div>
}
>
<PlayInner />
</Suspense>
);
}
+92
@@ -0,0 +1,92 @@
"use client";
import { useRouter } from "next/navigation";
import { useState } from "react";
export function CustomForm() {
const router = useRouter();
const [worldSetting, setWorldSetting] = useState("");
const [styleGuide, setStyleGuide] = useState("");
const [submitting, setSubmitting] = useState(false);
const canSubmit =
worldSetting.trim().length > 10 &&
styleGuide.trim().length > 5 &&
!submitting;
function handleSubmit(e: React.FormEvent) {
e.preventDefault();
if (!canSubmit) return;
setSubmitting(true);
sessionStorage.setItem(
"dada:custom",
JSON.stringify({ worldSetting, styleGuide }),
);
router.push("/play?custom=1");
}
return (
<form onSubmit={handleSubmit} className="space-y-12 animate-fade-in">
<div>
<label className="flex items-baseline justify-between mb-4">
<span className="text-[10px] smallcaps text-clay-700 font-medium">
<span className="text-clay-400 mr-2 font-serif italic not-italic font-normal">
</span>
World ·
</span>
<span className="text-[10px] text-clay-400 num">
{worldSetting.length}
</span>
</label>
<textarea
value={worldSetting}
onChange={(e) => setWorldSetting(e.target.value)}
rows={6}
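placeholder="e.g. A county town in southern China in the late 1990s. The protagonist is a final-year transfer student who, in a rainy June, meets a classmate who is always reading poetry on the rooftop. The story is slow-burning, understated, a little melancholy⋯"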
className="w-full bg-transparent border-0 border-b border-clay-900/20 px-0 py-3 text-clay-900 font-serif text-lg leading-[1.7] focus:outline-none focus:border-clay-700 transition-colors resize-none placeholder:font-serif placeholder:italic placeholder:text-base placeholder:leading-[1.7]"
/>
</div>
<div>
<label className="flex items-baseline justify-between mb-4">
<span className="text-[10px] smallcaps text-clay-700 font-medium">
<span className="text-clay-400 mr-2 font-serif italic not-italic font-normal">
</span>
Style ·
</span>
<span className="text-[10px] text-clay-400 num">
{styleGuide.length}
</span>
</label>
<textarea
value={styleGuide}
onChange={(e) => setStyleGuide(e.target.value)}
rows={4}
placeholder="e.g. Soft watercolor, warm afternoon light, anime visual novel style, classic dialogue panel⋯"
className="w-full bg-transparent border-0 border-b border-clay-900/20 px-0 py-3 text-clay-900 font-serif text-lg leading-[1.7] focus:outline-none focus:border-clay-700 transition-colors resize-none placeholder:font-serif placeholder:italic placeholder:text-base placeholder:leading-[1.7]"
/>
</div>
<div className="pt-6 flex items-center justify-between">
<span className="text-[10px] smallcaps text-clay-500">
{submitting
? "Summoning the first frame…"
: canSubmit
? "Ready when you are"
: "Two paragraphs · enough to begin"}
</span>
<button
type="submit"
disabled={!canSubmit}
className="group flex items-center gap-3 text-[10px] smallcaps text-clay-900 disabled:text-clay-300 disabled:cursor-not-allowed enabled:hover:text-ember-500 transition-colors duration-300"
>
Begin
<span className="w-10 h-px bg-current transition-all duration-300 group-enabled:group-hover:w-16" />
<i className="fa-solid fa-arrow-right text-[9px]" />
</button>
</div>
</form>
);
}
+106
@@ -0,0 +1,106 @@
"use client";
import { useRef } from "react";
export type Phase = "loading-first" | "ready" | "interacting";
export function PlayCanvas({
imageBase64,
phase,
pendingClick,
onClick,
}: {
imageBase64: string | null;
phase: Phase;
pendingClick: { x: number; y: number } | null;
onClick: (click: { x: number; y: number }) => void;
}) {
const ref = useRef<HTMLDivElement>(null);
function handleClick(e: React.MouseEvent<HTMLDivElement>) {
if (phase !== "ready" || !ref.current || !imageBase64) return;
const rect = ref.current.getBoundingClientRect();
const x = (e.clientX - rect.left) / rect.width;
const y = (e.clientY - rect.top) / rect.height;
onClick({
x: Math.max(0, Math.min(1, x)),
y: Math.max(0, Math.min(1, y)),
});
}
const interactive = phase === "ready" && !!imageBase64;
const dimmed = phase === "interacting";
return (
<div className="w-full max-w-[440px] mx-auto">
<div
ref={ref}
onClick={handleClick}
className={`relative aspect-[2/3] w-full overflow-hidden bg-cream-200 select-none ${interactive ? "cursor-pointer" : "cursor-wait"}`}
style={{
boxShadow:
"0 1px 0 rgba(45,24,16,0.05), 0 36px 64px -28px rgba(45,24,16,0.25), 0 8px 18px -6px rgba(45,24,16,0.10)",
}}
>
{imageBase64 ? (
<img
key={imageBase64.slice(-48)}
src={`data:image/png;base64,${imageBase64}`}
alt="Generated frame"
className={`absolute inset-0 w-full h-full object-cover animate-fade-in transition-opacity duration-700 ease-out ${dimmed ? "opacity-30" : "opacity-100"}`}
draggable={false}
/>
) : (
<div className="absolute inset-0 flex flex-col items-center justify-center gap-4">
<div className="w-1.5 h-1.5 bg-clay-500 rounded-full animate-slow-pulse" />
<p className="text-[9px] smallcaps text-clay-500 animate-slow-pulse">
Painting · the · first · frame
</p>
</div>
)}
<div className="absolute inset-x-0 top-0 h-12 bg-gradient-to-b from-clay-900/15 to-transparent pointer-events-none" />
<div className="absolute inset-x-0 bottom-0 h-12 bg-gradient-to-t from-clay-900/15 to-transparent pointer-events-none" />
{pendingClick && (
<>
<div
className="absolute rounded-full border border-ember-500 pointer-events-none"
style={{
left: `${pendingClick.x * 100}%`,
top: `${pendingClick.y * 100}%`,
transform: "translate(-50%, -50%)",
width: 30,
height: 30,
animation:
"dada-ripple 1.6s cubic-bezier(0.16,1,0.3,1) infinite",
}}
/>
<div
className="absolute rounded-full pointer-events-none"
style={{
left: `${pendingClick.x * 100}%`,
top: `${pendingClick.y * 100}%`,
transform: "translate(-50%, -50%)",
width: 11,
height: 11,
background: "#D97A2E",
boxShadow:
"0 0 0 3px rgba(251,247,240,0.95), 0 0 14px rgba(217,122,46,0.55)",
}}
/>
</>
)}
</div>
<div className="flex items-center justify-between mt-3 px-1">
<span className="text-[9px] smallcaps text-clay-400 num">
1024 × 1536 · png
</span>
<span className="text-[9px] smallcaps text-clay-400">
{phase === "ready" ? "Tap · anywhere" : "···"}
</span>
</div>
</div>
);
}
+38
@@ -0,0 +1,38 @@
"use client";
import { useRouter } from "next/navigation";
import type { Preset } from "@/lib/presets";
export function PresetCard({
preset,
ordinal,
}: {
preset: Preset;
ordinal: string;
}) {
const router = useRouter();
return (
<button
onClick={() => router.push(`/play?preset=${preset.id}`)}
className="group block w-full py-10 md:py-12 border-t border-clay-900/10 hover:border-clay-900/35 transition-[border-color,padding] duration-500 text-left"
>
<div className="flex items-baseline gap-6 md:gap-10">
<span className="font-serif italic text-2xl md:text-3xl text-clay-400 group-hover:text-clay-700 transition-colors duration-500 w-8 shrink-0">
{ordinal}
</span>
<div className="flex-1 min-w-0">
<h3 className="font-serif text-3xl md:text-4xl text-clay-900 leading-tight mb-2.5">
{preset.title}
</h3>
<p className="text-sm text-clay-600 leading-relaxed max-w-md">
{preset.blurb}
</p>
</div>
<span className="hidden md:flex items-center gap-3 text-[10px] tracking-[0.4em] text-clay-400 group-hover:text-ember-500 transition-colors duration-500 shrink-0 self-center">
ENTER
<span className="w-7 h-px bg-current transition-all duration-500 group-hover:w-12" />
</span>
</div>
</button>
);
}
+27
@@ -0,0 +1,27 @@
import type { EngineConfig } from "@dada/types";
function readVar(name: string): string {
const v = process.env[name];
if (!v) throw new Error(`Missing required environment variable: ${name}`);
return v;
}
export function loadEngineConfig(): EngineConfig {
return {
text: {
baseUrl: readVar("TEXT_BASE_URL"),
apiKey: readVar("TEXT_API_KEY"),
model: readVar("TEXT_MODEL"),
},
image: {
baseUrl: readVar("IMAGE_BASE_URL"),
apiKey: readVar("IMAGE_API_KEY"),
model: readVar("IMAGE_MODEL"),
},
vision: {
baseUrl: readVar("VISION_BASE_URL"),
apiKey: readVar("VISION_API_KEY"),
model: readVar("VISION_MODEL"),
},
};
}
+37
@@ -0,0 +1,37 @@
export type Preset = {
id: string;
title: string;
blurb: string;
worldSetting: string;
styleGuide: string;
};
export const PRESETS: Preset[] = [
{
id: "highschool",
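title: "June Rains",
blurb: "A county-town high school, a transfer student, an umbrella never handed over.",
worldSetting:
"The story takes place at a high school in a county town in southern China in the late 1990s. The protagonist is a final-year transfer student who, in a rainy June, meets a classmate who is always reading poetry on the rooftop. The plot is slow-burning, understated, and a little melancholy.",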
styleGuide:
"Anime visual novel style, soft watercolor lighting, warm afternoon palette, classic Japanese galgame dialogue panel.",
},
{
id: "cyberpunk",
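title: "Neon in the Rain",
blurb: "An amnesiac private detective, a call from a stranger.",
worldSetting:
"A rainy night in the East Asian Special Zone, 2087. The protagonist is a private detective who has just woken from a coma with three days of memory missing. His phone rings; on the other end is a woman who claims to know him.",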
styleGuide:
"Cinematic cyberpunk realism, neon reflections on wet streets, blade-runner palette, transparent neon HUD interface elements.",
},
{
id: "stickfigure",
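title: "Stick Figure Adventure",
blurb: "One pencil, one world, all of it erasing and redrawing.",
worldSetting:
"You are a stick figure drawn in pencil in a grid notebook, and you have just realized that you live on a student's scratch paper. The edges of the notebook are slowly being rubbed out by an eraser, and you must find a way to escape.",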
styleGuide:
"Hand-drawn pencil sketch on grid paper, stick figures, rough doodle UI elements, eraser smudges, notebook aesthetic.",
},
];
+4
@@ -0,0 +1,4 @@
/// <reference types="next" />
/// <reference types="next/image-types/global" />
// NOTE: This file should not be edited
+14
@@ -0,0 +1,14 @@
import type { NextConfig } from "next";
const config: NextConfig = {
reactStrictMode: true,
transpilePackages: ["@dada/engine", "@dada/ai-client", "@dada/types"],
serverExternalPackages: ["sharp"],
experimental: {
serverActions: {
bodySizeLimit: "10mb",
},
},
};
export default config;
+31
View File
@@ -0,0 +1,31 @@
{
"name": "@dada/web",
"version": "0.1.0",
"private": true,
"type": "module",
"scripts": {
"dev": "next dev",
"build": "next build",
"start": "next start",
"lint": "next lint",
"typecheck": "tsc --noEmit"
},
"dependencies": {
"@dada/ai-client": "workspace:*",
"@dada/engine": "workspace:*",
"@dada/types": "workspace:*",
"next": "^16.0.0",
"react": "^19.0.0",
"react-dom": "^19.0.0",
"sharp": "^0.33.5"
},
"devDependencies": {
"@types/node": "^22.9.0",
"@types/react": "^19.0.0",
"@types/react-dom": "^19.0.0",
"autoprefixer": "^10.4.20",
"postcss": "^8.4.49",
"tailwindcss": "^3.4.15",
"typescript": "^5.6.3"
}
}
+6
@@ -0,0 +1,6 @@
export default {
plugins: {
tailwindcss: {},
autoprefixer: {},
},
};
+57
@@ -0,0 +1,57 @@
import type { Config } from "tailwindcss";
const config: Config = {
content: ["./app/**/*.{ts,tsx}", "./components/**/*.{ts,tsx}"],
theme: {
extend: {
colors: {
cream: {
50: "#FBF7F0",
100: "#F5EFE3",
200: "#EBE0CB",
300: "#DCC9A8",
},
clay: {
400: "#C68B5C",
500: "#A8693B",
600: "#854F25",
700: "#5E371A",
900: "#2D1810",
},
ember: {
400: "#E89B5C",
500: "#D97A2E",
},
},
fontFamily: {
serif: ['"Cormorant Garamond"', '"Source Han Serif SC"', "ui-serif", "Georgia", "serif"],
sans: ['"Inter"', '"PingFang SC"', "ui-sans-serif", "system-ui", "sans-serif"],
},
letterSpacing: {
widest: "0.32em",
},
animation: {
"fade-in": "fadeIn 0.6s ease-out",
"slow-pulse": "slowPulse 2.6s ease-in-out infinite",
"drift": "drift 12s ease-in-out infinite",
},
keyframes: {
fadeIn: {
"0%": { opacity: "0", transform: "translateY(8px)" },
"100%": { opacity: "1", transform: "translateY(0)" },
},
slowPulse: {
"0%, 100%": { opacity: "0.55" },
"50%": { opacity: "1" },
},
drift: {
"0%, 100%": { transform: "translate(0, 0)" },
"50%": { transform: "translate(0, -10px)" },
},
},
},
},
plugins: [],
};
export default config;
+13
@@ -0,0 +1,13 @@
{
"extends": "../../tsconfig.base.json",
"compilerOptions": {
"noEmit": true,
"incremental": true,
"plugins": [{ "name": "next" }],
"paths": {
"@/*": ["./*"]
}
},
"include": ["next-env.d.ts", "**/*.ts", "**/*.tsx", ".next/types/**/*.ts"],
"exclude": ["node_modules"]
}
+21
@@ -0,0 +1,21 @@
{
"name": "dada",
"version": "0.1.0",
"private": true,
"description": "AI-driven visual novel — open source",
"license": "MIT",
"packageManager": "pnpm@9.12.0",
"engines": {
"node": ">=20"
},
"scripts": {
"dev": "pnpm --filter @dada/web dev",
"build": "pnpm --filter @dada/web build",
"start": "pnpm --filter @dada/web start",
"lint": "pnpm -r lint",
"typecheck": "pnpm -r typecheck"
},
"devDependencies": {
"typescript": "^5.6.3"
}
}
+17
@@ -0,0 +1,17 @@
{
"name": "@dada/ai-client",
"version": "0.1.0",
"private": true,
"type": "module",
"main": "./src/index.ts",
"types": "./src/index.ts",
"exports": {
".": "./src/index.ts"
},
"scripts": {
"typecheck": "tsc --noEmit"
},
"dependencies": {
"@dada/types": "workspace:*"
}
}
+41
@@ -0,0 +1,41 @@
import type { ProviderConfig } from "@dada/types";
export type ChatMessage = {
role: "system" | "user" | "assistant";
content: string;
};
export async function chat(
config: ProviderConfig,
messages: ChatMessage[],
opts?: { temperature?: number; responseFormat?: "json_object" | "text" },
): Promise<string> {
const url = `${config.baseUrl.replace(/\/$/, "")}/chat/completions`;
const body: Record<string, unknown> = {
model: config.model,
messages,
temperature: opts?.temperature ?? 0.9,
};
if (opts?.responseFormat === "json_object") {
body.response_format = { type: "json_object" };
}
const res = await fetch(url, {
method: "POST",
headers: {
"Content-Type": "application/json",
Authorization: `Bearer ${config.apiKey}`,
},
body: JSON.stringify(body),
});
if (!res.ok) {
const text = await res.text();
throw new Error(`Chat API error ${res.status}: ${text}`);
}
const json = (await res.json()) as {
choices: { message: { content: string } }[];
};
return json.choices[0]?.message.content ?? "";
}
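The request body built by `chat()` above can be looked at in isolation. The following is a minimal sketch, assuming a hypothetical helper name `buildChatBody` (not part of this commit) and a placeholder model name; it repeats the same body-building logic without the network call, to show that `response_format` is only attached in JSON mode and that `??` keeps an explicit temperature of `0`.

```typescript
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };
type ChatOptions = { temperature?: number; responseFormat?: "json_object" | "text" };

// Hypothetical helper: same body construction as chat(), minus the fetch.
function buildChatBody(
  model: string,
  messages: ChatMessage[],
  opts?: ChatOptions,
): Record<string, unknown> {
  const body: Record<string, unknown> = {
    model,
    messages,
    // `??` only falls back on undefined/null, so an explicit 0 survives.
    temperature: opts?.temperature ?? 0.9,
  };
  if (opts?.responseFormat === "json_object") {
    body.response_format = { type: "json_object" };
  }
  return body;
}

const messages: ChatMessage[] = [{ role: "user", content: "hi" }];
const plainBody = buildChatBody("example-model", messages);
const jsonBody = buildChatBody("example-model", messages, {
  temperature: 0,
  responseFormat: "json_object",
});

console.log(plainBody.temperature); // 0.9
console.log("response_format" in plainBody); // false
console.log(jsonBody.temperature); // 0
```

Not every OpenAI-compatible endpoint is guaranteed to accept `response_format`, which is presumably why the engine also parses model output leniently.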
+44
@@ -0,0 +1,44 @@
import type { ProviderConfig } from "@dada/types";
export async function generateImage(
config: ProviderConfig,
prompt: string,
opts?: { size?: string; quality?: "low" | "medium" | "high" | "auto" },
): Promise<string> {
const url = `${config.baseUrl.replace(/\/$/, "")}/images/generations`;
const body: Record<string, unknown> = {
model: config.model,
prompt,
size: opts?.size ?? "1024x1536",
quality: opts?.quality ?? "medium",
n: 1,
};
const res = await fetch(url, {
method: "POST",
headers: {
"Content-Type": "application/json",
Authorization: `Bearer ${config.apiKey}`,
},
body: JSON.stringify(body),
});
if (!res.ok) {
const text = await res.text();
throw new Error(`Image API error ${res.status}: ${text}`);
}
const json = (await res.json()) as {
data: { b64_json?: string; url?: string }[];
};
const item = json.data[0];
if (!item) throw new Error("Image API returned no data");
if (item.b64_json) return item.b64_json;
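if (item.url) {
// Some providers return a hosted URL instead of inline base64; download and encode it.
const imgRes = await fetch(item.url);
if (!imgRes.ok) {
throw new Error(`Image download error ${imgRes.status}: ${item.url}`);
}
const buf = await imgRes.arrayBuffer();
return Buffer.from(buf).toString("base64");
}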
throw new Error("Image API returned neither b64_json nor url");
}
+4
@@ -0,0 +1,4 @@
export { chat } from "./chat";
export { generateImage } from "./image";
export { interpretClick } from "./vision";
export type { ChatMessage } from "./chat";
+46
@@ -0,0 +1,46 @@
import type { ProviderConfig } from "@dada/types";
export async function interpretClick(
config: ProviderConfig,
imageBase64: string,
prompt: string,
): Promise<string> {
const url = `${config.baseUrl.replace(/\/$/, "")}/chat/completions`;
const body = {
model: config.model,
messages: [
{
role: "user",
content: [
{ type: "text", text: prompt },
{
type: "image_url",
image_url: { url: `data:image/png;base64,${imageBase64}` },
},
],
},
],
temperature: 0.2,
response_format: { type: "json_object" },
};
const res = await fetch(url, {
method: "POST",
headers: {
"Content-Type": "application/json",
Authorization: `Bearer ${config.apiKey}`,
},
body: JSON.stringify(body),
});
if (!res.ok) {
const text = await res.text();
throw new Error(`Vision API error ${res.status}: ${text}`);
}
const json = (await res.json()) as {
choices: { message: { content: string } }[];
};
return json.choices[0]?.message.content ?? "";
}
+7
@@ -0,0 +1,7 @@
{
"extends": "../../tsconfig.base.json",
"compilerOptions": {
"noEmit": true
},
"include": ["src/**/*"]
}
+19
@@ -0,0 +1,19 @@
{
"name": "@dada/engine",
"version": "0.1.0",
"private": true,
"type": "module",
"main": "./src/index.ts",
"types": "./src/index.ts",
"exports": {
".": "./src/index.ts"
},
"scripts": {
"typecheck": "tsc --noEmit"
},
"dependencies": {
"@dada/ai-client": "workspace:*",
"@dada/types": "workspace:*",
"sharp": "^0.33.5"
}
}
+30
@@ -0,0 +1,30 @@
import sharp from "sharp";
export async function annotateClick(
imageBase64: string,
click: { x: number; y: number },
): Promise<string> {
const buf = Buffer.from(imageBase64, "base64");
const meta = await sharp(buf).metadata();
const w = meta.width ?? 1024;
const h = meta.height ?? 1536;
// click.x / click.y are normalized to the 0..1 range; scale them to pixel coordinates.
const cx = Math.round(click.x * w);
const cy = Math.round(click.y * h);
const r = Math.round(Math.min(w, h) * 0.025);
const stroke = Math.max(3, Math.round(r * 0.25));
const svg = `<svg xmlns="http://www.w3.org/2000/svg" width="${w}" height="${h}">
<circle cx="${cx}" cy="${cy}" r="${r}" fill="rgba(255,40,40,0.55)"
stroke="rgba(255,255,255,0.95)" stroke-width="${stroke}" />
<circle cx="${cx}" cy="${cy}" r="${Math.round(r * 0.25)}"
fill="rgba(255,255,255,1)" />
</svg>`;
const out = await sharp(buf)
.composite([{ input: Buffer.from(svg), top: 0, left: 0 }])
.png()
.toBuffer();
return out.toString("base64");
}
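The marker arithmetic in `annotateClick` above can be checked without `sharp`. This is a small sketch, assuming a hypothetical pure function `markerGeometry` (not part of this commit) that repeats the same formulas for a normalized click on the default 1024x1536 canvas.

```typescript
type Marker = { cx: number; cy: number; r: number; stroke: number };

// Hypothetical helper: the same geometry as annotateClick, without image I/O.
function markerGeometry(
  click: { x: number; y: number },
  w: number,
  h: number,
): Marker {
  const cx = Math.round(click.x * w); // normalized x scaled to pixels
  const cy = Math.round(click.y * h); // normalized y scaled to pixels
  const r = Math.round(Math.min(w, h) * 0.025); // radius is 2.5% of the short side
  const stroke = Math.max(3, Math.round(r * 0.25)); // outline never thinner than 3px
  return { cx, cy, r, stroke };
}

const marker = markerGeometry({ x: 0.5, y: 0.25 }, 1024, 1536);
console.log(marker); // { cx: 512, cy: 384, r: 26, stroke: 7 }
```

On a 1024-pixel-wide image the dot is 26 pixels in radius, so it is large enough for a vision model to spot while still covering only a small part of a choice button.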
+37
@@ -0,0 +1,37 @@
import { chat } from "@dada/ai-client";
import type { ProviderConfig, Session, StoryFrame, UIElement } from "@dada/types";
import { parseJsonLoose } from "./jsonParser";
import { DIRECTOR_SYSTEM, buildDirectorUserMessage } from "./prompts";
type DirectorOutput = {
narration?: string;
speaker?: string;
line?: string;
scenePrompt: string;
uiElements: UIElement[];
};
export async function direct(
config: ProviderConfig,
session: Session,
): Promise<StoryFrame> {
const raw = await chat(
config,
[
{ role: "system", content: DIRECTOR_SYSTEM },
{ role: "user", content: buildDirectorUserMessage(session) },
],
{ temperature: 0.9, responseFormat: "json_object" },
);
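const parsed = parseJsonLoose<DirectorOutput>(raw);
// The image prompt is built from scenePrompt, so fail early if the model omitted it.
if (typeof parsed.scenePrompt !== "string" || parsed.scenePrompt.trim() === "") {
throw new Error("Director output is missing scenePrompt");
}
return {
id: `frame_${Date.now()}`,
narration: parsed.narration?.trim() || undefined,
speaker: parsed.speaker?.trim() || undefined,
line: parsed.line?.trim() || undefined,
scenePrompt: parsed.scenePrompt,
uiElements: Array.isArray(parsed.uiElements) ? parsed.uiElements : [],
};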
}
+3
@@ -0,0 +1,3 @@
export { startSession, takeTurn } from "./orchestrator";
export { annotateClick } from "./annotate";
export * from "./prompts";
+27
@@ -0,0 +1,27 @@
export function parseJsonLoose<T>(raw: string): T {
const trimmed = raw.trim();
try {
return JSON.parse(trimmed) as T;
} catch {
// fall through
}
const fenced = trimmed.match(/```(?:json)?\s*([\s\S]*?)\s*```/);
if (fenced?.[1]) {
try {
return JSON.parse(fenced[1]) as T;
} catch {
// fall through
}
}
const first = trimmed.indexOf("{");
const last = trimmed.lastIndexOf("}");
if (first !== -1 && last > first) {
const slice = trimmed.slice(first, last + 1);
try {
return JSON.parse(slice) as T;
} catch {
// fall through to the descriptive error below
}
}
throw new Error(`Failed to parse JSON from model output: ${raw.slice(0, 200)}`);
}
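The fallback order in `parseJsonLoose` above (whole string, then a fenced block, then the outermost braces) can be exercised with three kinds of model output. This is a compact restatement for illustration, not the file's code: the name `parseLoose` is hypothetical, and the fence marker is built with `repeat()` only to keep literal backtick runs out of the example.

```typescript
const FENCE = "`".repeat(3);
const FENCED_RE = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)\\s*" + FENCE);

// Hypothetical restatement: collect candidates in priority order, return the first that parses.
function parseLoose<T>(raw: string): T {
  const trimmed = raw.trim();
  const candidates: string[] = [trimmed];
  const fenced = trimmed.match(FENCED_RE);
  if (fenced?.[1]) candidates.push(fenced[1]);
  const first = trimmed.indexOf("{");
  const last = trimmed.lastIndexOf("}");
  if (first !== -1 && last > first) candidates.push(trimmed.slice(first, last + 1));
  for (const candidate of candidates) {
    try {
      return JSON.parse(candidate) as T;
    } catch {
      // try the next candidate
    }
  }
  throw new Error(`Failed to parse JSON from model output: ${raw.slice(0, 200)}`);
}

const direct = parseLoose<{ a: number }>('{"a":1}');
const fromFence = parseLoose<{ a: number }>(
  "Here you go:\n" + FENCE + 'json\n{"a":2}\n' + FENCE,
);
const fromProse = parseLoose<{ a: number }>('Sure! {"a":3} Hope that helps.');

console.log(direct.a, fromFence.a, fromProse.a); // 1 2 3
```

The brace-slicing step is deliberately crude: it recovers a single object wrapped in chatter, but it cannot repair truncated or otherwise malformed JSON.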
+71
@@ -0,0 +1,71 @@
import type {
EngineConfig,
InteractRequest,
InteractResponse,
Session,
StartRequest,
StartResponse,
} from "@dada/types";
import { annotateClick } from "./annotate";
import { direct } from "./director";
import { render } from "./renderer";
import { interpret } from "./vision";
function newSessionId(): string {
return `s_${Date.now()}_${Math.random().toString(36).slice(2, 8)}`;
}
export async function startSession(
config: EngineConfig,
req: StartRequest,
): Promise<StartResponse> {
const session: Session = {
id: newSessionId(),
createdAt: Date.now(),
worldSetting: req.worldSetting.trim(),
styleGuide: req.styleGuide.trim(),
history: [],
};
const frame = await direct(config.text, session);
const imageBase64 = await render(config.image, frame, session.styleGuide);
return {
sessionId: session.id,
frame,
imageBase64,
};
}
export async function takeTurn(
config: EngineConfig,
req: InteractRequest,
): Promise<InteractResponse> {
const annotated = await annotateClick(req.prevImageBase64, req.click);
const lastFrame = req.session.history.at(-1)?.frame;
const uiElements = lastFrame?.uiElements ?? [];
const intent = await interpret(config.vision, annotated, uiElements);
const updatedSession: Session = {
...req.session,
history: req.session.history.map((entry, idx, arr) =>
idx === arr.length - 1 ? { ...entry, click: req.click, intent } : entry,
),
};
const nextFrame = await direct(config.text, updatedSession);
const nextImage = await render(
config.image,
nextFrame,
updatedSession.styleGuide,
);
return {
session: updatedSession,
frame: nextFrame,
imageBase64: nextImage,
intent,
};
}
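`takeTurn` above reads the most recent frame from `req.session.history` and annotates that last entry with the click and intent, but neither function in this file appends the frame it returns. One reading, and it is only an assumption about the caller, is that the web app adds each returned frame to the history before the next turn. A minimal sketch of that step, using simplified stand-in types and a hypothetical helper `appendFrame`:

```typescript
// Simplified stand-ins for StoryFrame / HistoryEntry / Session from @dada/types.
type Frame = { id: string; scenePrompt: string };
type Entry = { frame: Frame; click?: { x: number; y: number } };
type SessionLike = { id: string; history: Entry[] };

// Hypothetical client-side helper: returns a new session with `frame` appended,
// leaving the original session object untouched.
function appendFrame(session: SessionLike, frame: Frame): SessionLike {
  return { ...session, history: [...session.history, { frame }] };
}

const s0: SessionLike = { id: "s_demo", history: [] };
const s1 = appendFrame(s0, { id: "frame_1", scenePrompt: "a rainy rooftop" });
const s2 = appendFrame(s1, { id: "frame_2", scenePrompt: "an empty classroom" });

console.log(s0.history.length, s1.history.length, s2.history.length); // 0 1 2
console.log(s2.history.at(-1)?.frame.id); // frame_2
```

If the caller skipped this step, `lastFrame` in `takeTurn` would be `undefined` and the vision model would receive an empty list of known UI elements.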
+115
@@ -0,0 +1,115 @@
import type { Session, StoryFrame, UIElement } from "@dada/types";
export const DIRECTOR_SYSTEM = `你是一个交互视觉小说的编剧导演。每次根据世界观、画风和历史,输出当前画面要呈现的内容。
必须输出严格 JSON,结构如下:
{
"narration": "本帧旁白(可空字符串)",
"speaker": "本帧说话角色名(可空)",
"line": "本帧角色台词(可空)",
"scenePrompt": "英文场景描述,给图像模型用,描述画面里看到什么",
"uiElements": [
{ "id": "choice_1", "kind": "choice", "label": "选项一文字(≤15 字)" },
{ "id": "choice_2", "kind": "choice", "label": "选项二文字(≤15 字)" },
{ "id": "choice_3", "kind": "choice", "label": "选项三文字(≤15 字)" }
]
}
规则:
- narration / line 中文,scenePrompt 英文
- 默认 3 个 choice 元素,可以根据情境额外加 menu/item/custom(罕见)
- 选项必须能切实推进剧情,且互不重复
- scenePrompt 描述当前的画面,不要包括 UI 元素,UI 元素会另外渲染
- 单帧旁白与台词加起来控制在 80 字以内
- 不要输出 JSON 以外的任何文本`;
export function buildDirectorUserMessage(session: Session): string {
const parts: string[] = [];
parts.push(`世界观:${session.worldSetting}`);
parts.push(`画风:${session.styleGuide}`);
if (session.history.length === 0) {
parts.push("\n这是故事的开场。请生成开场画面。");
return parts.join("\n");
}
parts.push("\n历史:");
session.history.forEach((entry, idx) => {
const f = entry.frame;
const beat: string[] = [`【第 ${idx + 1} 帧】`];
if (f.narration) beat.push(`旁白:${f.narration}`);
if (f.line) beat.push(`${f.speaker ?? "?"}:${f.line}`);
if (entry.intent) {
beat.push(
`用户行为:${entry.intent.targetLabel ?? entry.intent.freeformAction ?? "未知"}`,
);
}
parts.push(beat.join("\n"));
});
parts.push("\n请生成下一帧。");
return parts.join("\n");
}
export function buildImagePrompt(
frame: StoryFrame,
styleGuide: string,
): string {
const choiceList = frame.uiElements
.filter((e) => e.kind === "choice")
.map((e, i) => `${i + 1}. ${e.label}`)
.join("\n");
const extraUI = frame.uiElements
.filter((e) => e.kind !== "choice")
.map((e) => `- ${e.kind}: ${e.label}`)
.join("\n");
return `Generate a vertical 9:16 visual novel UI screen.
ART STYLE: ${styleGuide}
(Match this style consistently — for the scene art AND the UI elements.
For example: anime → traditional galgame dialogue box; cyberpunk → neon HUD;
stick figure → hand-drawn paper UI; cinematic realism → minimalist film overlay.)
SCENE (occupies the upper portion of the image):
${frame.scenePrompt}
DIALOGUE PANEL (semi-transparent, lower-middle area):
${frame.speaker ? `Speaker name displayed prominently: "${frame.speaker}"` : "Narration only — no speaker tag."}
${frame.line ? `Dialogue text: "${frame.line}"` : ""}
${frame.narration ? `Narration text (italic if speaker also present): "${frame.narration}"` : ""}
CHOICE PANEL (bottom area, three clearly tappable buttons stacked or arranged):
${choiceList}
${extraUI ? `\nADDITIONAL UI ELEMENTS:\n${extraUI}` : ""}
CRITICAL LAYOUT REQUIREMENTS:
- All text must be perfectly legible (high contrast, readable size)
- Choice buttons must be clearly distinguishable as interactive elements
- Choice text must NOT be cropped, NOT overlap with character faces
- The image is the entire interface — no external chrome will be added
- Choices appear in the order listed above`;
}
export const VISION_SYSTEM_PROMPT = `你是视觉理解助手。用户在视觉小说界面上点击了红色圆点位置,你要根据红点位置和图中可见的 UI 元素,判断用户的意图。
必须输出严格 JSON:
{
"targetId": "对应的 UI 元素 id(choice_1 / choice_2 / choice_3 / menu / ...),如果点击的是非 UI 区域则为 null",
"targetLabel": "对应 UI 元素的文字描述(如 '告诉她真相'),未知则为 null",
"reasoning": "一句话说明判断理由",
"freeformAction": "如果用户点的是场景中的物件/角色等非选项区域,描述他可能的意图(如 '想拿起桌上的钥匙'),否则空字符串"
}
不要输出 JSON 以外的任何文本。`;
export function buildVisionUserPrompt(uiElements: UIElement[]): string {
const list = uiElements
.map((e) => `- id="${e.id}" kind="${e.kind}" label="${e.label}"`)
.join("\n");
return `当前画面包含以下已知 UI 元素:
${list}
红点位置即为用户点击位置。请判断用户的意图。`;
}
+12
@@ -0,0 +1,12 @@
import { generateImage } from "@dada/ai-client";
import type { ProviderConfig, StoryFrame } from "@dada/types";
import { buildImagePrompt } from "./prompts";
export async function render(
config: ProviderConfig,
frame: StoryFrame,
styleGuide: string,
): Promise<string> {
const prompt = buildImagePrompt(frame, styleGuide);
return generateImage(config, prompt, { size: "1024x1536", quality: "medium" });
}
+26
@@ -0,0 +1,26 @@
import { interpretClick } from "@dada/ai-client";
import type { ClickIntent, ProviderConfig, UIElement } from "@dada/types";
import { parseJsonLoose } from "./jsonParser";
import { VISION_SYSTEM_PROMPT, buildVisionUserPrompt } from "./prompts";
export async function interpret(
config: ProviderConfig,
annotatedImageBase64: string,
uiElements: UIElement[],
): Promise<ClickIntent> {
const userPrompt = `${VISION_SYSTEM_PROMPT}\n\n${buildVisionUserPrompt(uiElements)}`;
const raw = await interpretClick(config, annotatedImageBase64, userPrompt);
const parsed = parseJsonLoose<{
targetId?: string | null;
targetLabel?: string | null;
reasoning?: string;
freeformAction?: string;
}>(raw);
return {
targetId: parsed.targetId ?? null,
targetLabel: parsed.targetLabel ?? null,
reasoning: parsed.reasoning ?? "",
freeformAction: parsed.freeformAction || undefined,
};
}
+7
@@ -0,0 +1,7 @@
{
"extends": "../../tsconfig.base.json",
"compilerOptions": {
"noEmit": true
},
"include": ["src/**/*"]
}
+14
@@ -0,0 +1,14 @@
{
"name": "@dada/types",
"version": "0.1.0",
"private": true,
"type": "module",
"main": "./src/index.ts",
"types": "./src/index.ts",
"exports": {
".": "./src/index.ts"
},
"scripts": {
"typecheck": "tsc --noEmit"
}
}
+74
@@ -0,0 +1,74 @@
export type UIElementKind = "choice" | "menu" | "item" | "custom";
export type UIElement = {
id: string;
kind: UIElementKind;
label: string;
hint?: string;
};
export type StoryFrame = {
id: string;
narration?: string;
speaker?: string;
line?: string;
scenePrompt: string;
uiElements: UIElement[];
};
export type ClickIntent = {
targetId: string | null;
targetLabel: string | null;
reasoning: string;
freeformAction?: string;
};
export type HistoryEntry = {
frame: StoryFrame;
click?: { x: number; y: number };
intent?: ClickIntent;
};
export type Session = {
id: string;
createdAt: number;
worldSetting: string;
styleGuide: string;
history: HistoryEntry[];
};
export type ProviderConfig = {
baseUrl: string;
apiKey: string;
model: string;
};
export type EngineConfig = {
text: ProviderConfig;
image: ProviderConfig;
vision: ProviderConfig;
};
export type StartRequest = {
worldSetting: string;
styleGuide: string;
};
export type StartResponse = {
sessionId: string;
frame: StoryFrame;
imageBase64: string;
};
export type InteractRequest = {
session: Session;
prevImageBase64: string;
click: { x: number; y: number };
};
export type InteractResponse = {
session: Session;
frame: StoryFrame;
imageBase64: string;
intent: ClickIntent;
};
+7
@@ -0,0 +1,7 @@
{
"extends": "../../tsconfig.base.json",
"compilerOptions": {
"noEmit": true
},
"include": ["src/**/*"]
}
+3
@@ -0,0 +1,3 @@
packages:
- "apps/*"
- "packages/*"
+17
@@ -0,0 +1,17 @@
{
"compilerOptions": {
"target": "ES2022",
"lib": ["ES2022", "DOM", "DOM.Iterable"],
"module": "ESNext",
"moduleResolution": "Bundler",
"esModuleInterop": true,
"strict": true,
"skipLibCheck": true,
"forceConsistentCasingInFileNames": true,
"resolveJsonModule": true,
"isolatedModules": true,
"noUncheckedIndexedAccess": true,
"verbatimModuleSyntax": false,
"jsx": "preserve"
}
}
+10
@@ -0,0 +1,10 @@
{
"$schema": "https://openapi.vercel.sh/vercel.json",
"framework": "nextjs",
"buildCommand": "pnpm build",
"installCommand": "pnpm install",
"functions": {
"apps/web/app/api/interact/route.ts": { "maxDuration": 60 },
"apps/web/app/api/start/route.ts": { "maxDuration": 60 }
}
}