Model-backed tools

Most MCP tools compute their own result and return it. A model-backed tool is different: you declare — in the tool's MCP metadata — which AI model should produce the result, and WizChat runs that model through the Vercel AI Gateway when the tool is called. The tool returns the model's output (text, or a generated image) instead of dispatching to your server.

This generalizes the in-dialog wizchat/llm completion: instead of the dialog calling a reserved tool with a prompt, any tool can be wired to a model server-side. You pick the model per tool — so one tool can be answered by google/gemini-3.1-flash-image while everything else uses your chatbot's default.

Who this is for

Developers building an MCP server, who declare the model in the tool's _meta, and chatbot owners, who flip one toggle to allow it (the same cost gate as in-dialog completions). If your tool computes its own result, you don't need this.

When to use it

A tool whose job is "transform this input with an LLM" — summarize, rewrite, classify, extract — where you'd rather not run (or pay for, or key) a model on your server.
A tool that draws — render a diagram, chart, or illustration from structured input — by declaring an image model.

If the tool needs to call other tools or reason over multiple steps, that's the chat agent, not a model-backed tool. A model-backed tool is a single, one-shot model call.

The contract

Add a wizchat block to the tool's MCP _meta:

{
  "name": "summarize_record",
  "description": "Summarize the given record in two sentences.",
  "inputSchema": { /* … */ },
  "_meta": {
    "wizchat": {
      "model": "openai/gpt-4.1",   // exact Vercel AI Gateway model id
      "output": "text"              // "text" (default) or "image"
    }
  }
}

model (required to make a tool model-backed) — the exact model id as written in the Vercel AI Gateway catalog, e.g. openai/gpt-4.1, anthropic/claude-sonnet-4.6, google/gemini-3.1-flash-image. (Look up the live id in the catalog — the examples here are illustrative.) The infrastructure-provider suffix is supported too: alibaba/qwen3-max@bedrock.
output (optional, default "text") — "text" runs a text model; "image" runs an image model.

WizChat validates the model when you add or refresh the server: the id must exist in the live gateway catalog and its type must match output (a "text" tool needs a language model; an "image" tool needs an image model). A type mismatch is surfaced as a warning at add/refresh time; an invalid id is rejected at call time with a clear error that names the model.
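As a rough illustration of the check described above, here is a minimal sketch. The `catalog` map, its type names ("language" and "image"), and the `checkWizchatMeta` function are hypothetical stand-ins; the real catalog shape and WizChat's own validation code are not shown in this document. The sketch also does not handle the infrastructure-provider suffix.

```javascript
// Hypothetical stand-in for the live gateway catalog: model id -> model type.
// The real catalog's shape and type names are assumptions for this sketch.
const catalog = new Map([
  ['openai/gpt-4.1', 'language'],
  ['google/gemini-3.1-flash-image', 'image'],
]);

// Returns a list of problems for a tool's _meta block (empty if none).
function checkWizchatMeta(meta, catalog) {
  const problems = [];
  const wizchat = meta?.wizchat;
  if (!wizchat || typeof wizchat.model !== 'string') {
    return problems; // no model declared: an ordinary tool, nothing to check
  }
  const output = wizchat.output ?? 'text'; // "text" is the documented default
  if (output !== 'text' && output !== 'image') {
    problems.push(`unknown output "${output}"`);
    return problems;
  }
  const type = catalog.get(wizchat.model);
  if (type === undefined) {
    problems.push(`model "${wizchat.model}" is not in the catalog`);
    return problems;
  }
  // A "text" tool needs a language model; an "image" tool needs an image model.
  const expected = output === 'image' ? 'image' : 'language';
  if (type !== expected) {
    problems.push(`model "${wizchat.model}" is a ${type} model, but output is "${output}"`);
  }
  return problems;
}
```

Running a check like this against your own tool definitions before adding the server can catch a typo in the model id early, although only WizChat's check against the live catalog is authoritative.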

How the prompt is built

A model-backed tool has no prompt field of its own — WizChat assembles one from the tool call:

The tool's description is the standing instruction (used as the system prompt for text; prepended to the prompt for image, since image generation has no system role).
The input is the call arguments: a string prompt argument if you pass one, otherwise the JSON-serialized arguments.

So a dialog calls it through the normal bridge and gets the model's output back:

// Text tool — returns the model's text as a standard CallToolResult.
const result = await callTool('summarize_record', { record: someObject });
const text = (result?.content || []).find((c) => c.type === 'text')?.text ?? '';
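The prompt-assembly rules in this section can be sketched as follows. This is an illustration of the documented behavior, not WizChat's implementation: the `buildPrompt` name, the returned `{ system, prompt }` shape, and the blank-line separator used for image prompts are all assumptions.

```javascript
// Sketch of the documented prompt rules; names and return shape are hypothetical.
function buildPrompt(tool, args) {
  // wizchatImageOptions is stripped before the prompt is built.
  const { wizchatImageOptions, ...rest } = args ?? {};
  // A string `prompt` argument is used as-is; otherwise the arguments are JSON-serialized.
  const input = typeof rest.prompt === 'string' ? rest.prompt : JSON.stringify(rest);
  const output = tool._meta?.wizchat?.output ?? 'text';
  if (output === 'image') {
    // Image generation has no system role, so the description is prepended.
    // The "\n\n" separator is an assumption.
    return { prompt: `${tool.description}\n\n${input}` };
  }
  // For text, the description acts as the system prompt.
  return { system: tool.description, prompt: input };
}
```

One practical consequence: because the description is the standing instruction, it is worth writing it as a direct instruction to the model rather than as a note to human readers.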

Image output

Set output: "image" and declare an image model. The result is one or more MCP image content blocks ({ type: 'image', data: <base64>, mimeType }) — your dialog renders them however it likes (WizChat produces the image; displaying it is your dialog's job, same as any tool result).

{
  "name": "render_diagram",
  "description": "Render a clean schematic image from the supplied JSON spec.",
  "_meta": { "wizchat": { "model": "google/gemini-3.1-flash-image", "output": "image" } }
}

Image-generation parameters are passed per call, inside a single reserved argument named wizchatImageOptions (all fields optional, validated and clamped server-side). They live under their own key, never as top-level arguments, so they can't collide with your tool's own domain arguments (a size that means "diameter", an n that means a count, …). WizChat strips wizchatImageOptions before building the prompt, so it never leaks into the model input. On a text tool the field is stripped from the prompt and otherwise ignored.

| `wizchatImageOptions` field | Meaning | Notes |
| --- | --- | --- |
| `size` | `"{width}x{height}"`, e.g. `"1024x1024"` | each side 64–4096; takes precedence over `aspectRatio` |
| `aspectRatio` | `"{w}:{h}"`, e.g. `"16:9"` | used only when `size` is absent (some image models prefer this) |
| `n` | number of images | clamped to 1–4 |
| `seed` | integer seed for reproducibility | optional |

const result = await callTool('render_diagram', {
  prompt: 'a labeled flowchart of the steps',
  wizchatImageOptions: { size: '1024x1024' },
});
const image = (result?.content || []).find((c) => c.type === 'image');
// image.data is base64; image.mimeType is e.g. "image/png"
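To make the option rules concrete, here is a minimal sketch of how they might be applied. The `normalizeImageOptions` function is hypothetical. The document says options are validated and clamped but does not say exactly how each field is treated, so clamping each side of `size`, treating a malformed `size` as absent, and dropping a non-integer `seed` are assumptions.

```javascript
// Sketch of the documented wizchatImageOptions rules. The handling of malformed
// values is an assumption; the real server may reject them instead.
function normalizeImageOptions(raw = {}) {
  const clamp = (v, lo, hi) => Math.min(hi, Math.max(lo, v));
  const out = {};
  const size = /^(\d+)x(\d+)$/.exec(raw.size ?? '');
  const ratio = /^(\d+):(\d+)$/.exec(raw.aspectRatio ?? '');
  if (size) {
    // size takes precedence over aspectRatio; each side is kept within 64–4096.
    const w = clamp(Number(size[1]), 64, 4096);
    const h = clamp(Number(size[2]), 64, 4096);
    out.size = `${w}x${h}`;
  } else if (ratio && Number(ratio[1]) > 0 && Number(ratio[2]) > 0) {
    // aspectRatio is used only when size is absent.
    out.aspectRatio = raw.aspectRatio;
  }
  if (Number.isFinite(raw.n)) {
    out.n = clamp(Math.round(raw.n), 1, 4); // n is clamped to 1–4
  }
  if (Number.isInteger(raw.seed)) {
    out.seed = raw.seed;
  }
  return out;
}
```

Because clamping happens server-side either way, a dialog does not need this logic to make a call. A local version is mainly useful for showing the user the values that will likely be applied.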

Response size

The total image payload is capped (a few MB) to stay under the platform's response limit. A single image of roughly 1024×1024 pixels is fine; if you request several large images and exceed the cap, the call is rejected cleanly rather than truncated. Prefer n: 1 and a modest size unless you need more.
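If you want a feel for how close a result comes to the cap, a dialog can measure the images it receives. The sketch below is only an estimate under stated assumptions: it assumes `data` is standard padded base64 with no whitespace, and the helper names are hypothetical. The exact cap, and whether it counts encoded or decoded bytes, is not specified here.

```javascript
// Decoded byte length of a padded base64 string: 3 bytes per 4 characters,
// minus one byte per trailing "=" padding character.
function base64Bytes(b64) {
  const padding = b64.endsWith('==') ? 2 : b64.endsWith('=') ? 1 : 0;
  return (b64.length / 4) * 3 - padding;
}

// Sums the decoded size of every image content block in a tool result.
function totalImageBytes(result) {
  return (result?.content || [])
    .filter((c) => c.type === 'image' && typeof c.data === 'string')
    .reduce((sum, c) => sum + base64Bytes(c.data), 0);
}
```

Base64 text is about one third larger than the bytes it encodes, so the size on the wire is larger than the decoded size by roughly that factor.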

Turning it on

Model-backed tools spend the chatbot owner's gateway budget, so they sit behind the same master switch as in-dialog completions:

Open Chatbot → MCP Servers, then Edit the server.
Enable "Allow this server's dialogs to run LLM completions" (allowLlmCompletion).
Save.

Declaring _meta.wizchat.model is necessary but not sufficient — without the toggle, a model-backed tool is rejected (so a server can't quietly spend your tokens just by declaring a model).

Limits & security

Owner-gated. Runs only when the owner has enabled completions for that server; otherwise rejected.
Re-gated per call. Like every dialog action, each call re-verifies the signed-in user, their access to the server, and the live owner toggle — server-side.
Bounded spend. Shares the in-dialog completion budget: up to 5 model runs per opened dialog, plus a per-user/per-chatbot hourly rate limit. The tool input is capped at ~100 KB; text output at 2,000 tokens; image payloads are size-capped.
No prompt/output retention. WizChat records the call for cost/observability (gateway model + generation id) but does not retain the prompt, the completion text, or the image bytes in its tracing.
Gateway only, owner's account. The model runs through the AI Gateway on the owner's account — the tool can't pass its own API key, and cost is tracked like any other WizChat usage.
30-second timeout. A hung model call fails cleanly rather than hanging the dialog.
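Since a call can be rejected for several reasons (toggle off, budget exhausted, timeout), a dialog may want one place that interprets the result. The sketch below is an assumption about how failures surface: it relies on the `isError` flag that the MCP CallToolResult type defines, and this document does not state whether WizChat reports rejections that way or by throwing. The `readToolResult` name is hypothetical.

```javascript
// Interprets a CallToolResult-shaped object. Assumes failures are flagged with
// the MCP `isError` field; how WizChat actually reports rejections is not
// specified in this document.
function readToolResult(result) {
  const content = result?.content || [];
  const text = content
    .filter((c) => c.type === 'text')
    .map((c) => c.text)
    .join('\n');
  if (result?.isError) {
    return { ok: false, error: text || 'tool call failed' };
  }
  return { ok: true, text, images: content.filter((c) => c.type === 'image') };
}

// Usage inside a dialog (callTool is the bridge function from the examples above):
//   let outcome;
//   try {
//     outcome = readToolResult(await callTool('summarize_record', { record }));
//   } catch (err) {
//     outcome = { ok: false, error: String(err) };
//   }
```

Handling both the thrown and the flagged case means the dialog can show one consistent message whichever way a rejection arrives.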

The model can be wrong

Treat a model-backed result as a suggestion. Show it to the user and validate any structured output (e.g. with your server's own validation tool) before saving or acting on it.
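As one way to apply this, a dialog can run a cheap local check on structured output before passing it to your server's own validation. The sketch below assumes a hypothetical classify-style tool that is asked to return JSON with `category` and `confidence` fields; those field names and the allowed categories are invented for illustration.

```javascript
// Parses model text as JSON and checks it against a small expected shape.
// The fields (category, confidence) and allowed values are hypothetical.
function parseClassification(text) {
  let value;
  try {
    value = JSON.parse(text);
  } catch {
    return { ok: false, reason: 'not valid JSON' };
  }
  if (value === null || typeof value !== 'object') {
    return { ok: false, reason: 'not an object' };
  }
  const allowed = ['bug', 'feature', 'question'];
  if (!allowed.includes(value.category)) {
    return { ok: false, reason: 'unknown category' };
  }
  const c = value.confidence;
  if (typeof c !== 'number' || c < 0 || c > 1) {
    return { ok: false, reason: 'confidence out of range' };
  }
  // Copy only the expected fields so extra model output is not saved by accident.
  return { ok: true, value: { category: value.category, confidence: c } };
}
```

A check like this does not replace server-side validation. It keeps obviously malformed output from reaching the save step and gives the user a clearer message.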

Scope today

Model-backed tools run on the MCP-UI dialog path — i.e. when a tool is fired from a server-rendered dialog. A tool the chat agent chooses to call mid-conversation still dispatches to your server normally.

Related

Interactive UI (MCP-UI): render dialogs, and the in-dialog wizchat/llm completion this builds on.
Custom MCP servers: adding and configuring your own MCP servers.
AI Gateway & model ids: where the model strings come from.
