Model-backed tools

Most MCP tools compute their own result and return it. A model-backed tool is different: you declare — in the tool's MCP metadata — which AI model should produce the result, and WizChat runs that model through the Vercel AI Gateway when the tool is called. The tool returns the model's output (text, or a generated image) instead of dispatching to your server.

This generalizes the in-dialog wizchat/llm completion: instead of the dialog calling a reserved tool with a prompt, any tool can be wired to a model server-side. You pick the model per tool — so one tool can be answered by google/gemini-3.1-flash-image while everything else uses your chatbot's default.

Who this is for

Developers building an MCP server declare the model in the tool's _meta. Chatbot owners flip one toggle to allow it (the same cost door as in-dialog completions). If your tool computes its own result, you don't need this.

When to use it

  • A tool whose job is "transform this input with an LLM" — summarize, rewrite, classify, extract — where you'd rather not run (or pay for, or key) a model on your server.
  • A tool that draws — render a diagram, chart, or illustration from structured input — by declaring an image model.

If the tool needs to call other tools or reason over multiple steps, that's the chat agent, not a model-backed tool. A model-backed tool is a single, one-shot model call.

The contract

Add a wizchat block to the tool's MCP _meta:

{
  "name": "summarize_record",
  "description": "Summarize the given record in two sentences.",
  "inputSchema": { /* … */ },
  "_meta": {
    "wizchat": {
      "model": "openai/gpt-4.1", // exact Vercel AI Gateway model id
      "output": "text"           // "text" (default) or "image"
    }
  }
}

  • model (required to make a tool model-backed) — the exact model id as written in the Vercel AI Gateway catalog, e.g. openai/gpt-4.1, anthropic/claude-sonnet-4.6, google/gemini-3.1-flash-image. (Look up the live id in the catalog — the examples here are illustrative.) The infrastructure-provider suffix is supported too: alibaba/qwen3-max@bedrock.
  • output (optional, default "text") — "text" runs a text model; "image" runs an image model.

WizChat validates the model when you add or refresh the server: the id must exist in the live gateway catalog and its type must match output (a "text" tool needs a language model; an "image" tool needs an image model). A mismatch is surfaced as a warning at add/refresh time, and an invalid id is rejected at call time with a clear, model-named error.
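
The validation rule above can be sketched as a small function. This is an illustration under stated assumptions, not WizChat's implementation: the catalog shape (a Map from model id to `{ type: 'language' | 'image' }`) and the names `readWizchatMeta` and `checkModelDeclaration` are hypothetical, and provider suffixes such as `@bedrock` are not handled here.

```javascript
// Hypothetical sketch of the add/refresh-time check described above.
// Assumption: the catalog is a Map of model id -> { type: 'language' | 'image' }.

// Read the wizchat block from a tool definition, applying the documented default output.
function readWizchatMeta(tool) {
  const meta = tool?._meta?.wizchat;
  if (!meta || typeof meta.model !== 'string') return null; // not model-backed
  return { model: meta.model, output: meta.output ?? 'text' };
}

// Return a list of human-readable problems; an empty list means nothing was found.
function checkModelDeclaration(tool, catalog) {
  const meta = readWizchatMeta(tool);
  if (!meta) return [];
  if (meta.output !== 'text' && meta.output !== 'image') {
    return [`unknown output "${meta.output}"`];
  }
  const entry = catalog.get(meta.model);
  if (!entry) return [`model "${meta.model}" is not in the catalog`];
  // A "text" tool needs a language model; an "image" tool needs an image model.
  const expected = meta.output === 'image' ? 'image' : 'language';
  if (entry.type !== expected) {
    return [`model "${meta.model}" is a ${entry.type} model, but output "${meta.output}" needs a ${expected} model`];
  }
  return [];
}
```

Running a check like this in your own server's test suite could catch a typo in the model id before the server is added or refreshed.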

How the prompt is built

A model-backed tool has no prompt field of its own — WizChat assembles one from the tool call:

  • The tool's description is the standing instruction (used as the system prompt for text; prepended to the prompt for image, since image generation has no system role).
  • The input is the call arguments: a string prompt argument if you pass one, otherwise the JSON-serialized arguments.

So a dialog calls it through the normal bridge and gets the model's output back:

// Text tool — returns the model's text as a standard CallToolResult.
const result = await callTool('summarize_record', { record: someObject });
const text = (result?.content || []).find((c) => c.type === 'text')?.text ?? '';
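
To make the prompt-assembly rules in this section concrete, here is a minimal sketch of how the description and arguments could be combined. The function name `buildModelInput`, its return shape, and the blank-line separator used for image prompts are assumptions made for illustration; only the rules themselves come from this page.

```javascript
// Hypothetical sketch of the prompt assembly described in this section.
function buildModelInput(tool, args) {
  // The reserved wizchatImageOptions argument is stripped so it never reaches the model.
  const { wizchatImageOptions, ...rest } = args ?? {};
  // A string `prompt` argument wins; otherwise the remaining arguments are JSON-serialized.
  const input = typeof rest.prompt === 'string' ? rest.prompt : JSON.stringify(rest);
  const output = tool?._meta?.wizchat?.output ?? 'text';
  const description = tool?.description ?? '';
  if (output === 'image') {
    // Image generation has no system role, so the description is prepended to the prompt.
    // Assumption: a blank line separates the two parts.
    return { prompt: description ? `${description}\n\n${input}` : input };
  }
  // Text tools use the description as the system prompt.
  return { system: description, prompt: input };
}
```

One practical consequence: because the description acts as the standing instruction, it is worth writing it as a direct instruction to the model rather than as a label for human readers.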

Image output

Set output: "image" and declare an image model. The result is one or more MCP image content blocks ({ type: 'image', data: <base64>, mimeType }) — your dialog renders them however it likes (WizChat produces the image; displaying it is your dialog's job, same as any tool result).

{
  "name": "render_diagram",
  "description": "Render a clean schematic image from the supplied JSON spec.",
  "_meta": { "wizchat": { "model": "google/gemini-3.1-flash-image", "output": "image" } }
}

Image-generation parameters are passed per call, inside a single reserved argument named wizchatImageOptions. Every field is optional, and all of them are validated and clamped server-side. The options live under their own key, never as top-level arguments, so they can't collide with your tool's own domain arguments (a size that means "diameter", an n that means a count, …). WizChat strips wizchatImageOptions before building the prompt, so it never leaks into the model input. On a text tool the field is stripped from the prompt in the same way and otherwise ignored.

| wizchatImageOptions field | Meaning | Notes |
|---|---|---|
| size | "{width}x{height}", e.g. "1024x1024" | each side 64–4096; takes precedence over aspectRatio |
| aspectRatio | "{w}:{h}", e.g. "16:9" | used only when size is absent (some image models prefer this) |
| n | number of images | clamped to 1–4 |
| seed | integer seed for reproducibility | optional |

const result = await callTool('render_diagram', {
  prompt: 'a labeled flowchart of the steps',
  wizchatImageOptions: { size: '1024x1024' },
});
const image = (result?.content || []).find((c) => c.type === 'image');
// image.data is base64; image.mimeType is e.g. "image/png"
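
The option rules in the table can be sketched as a normalizer. This is a hypothetical illustration, not WizChat's server code: the function name `normalizeImageOptions` is invented, and it assumes that out-of-range sides are clamped rather than rejected and that malformed values are silently dropped.

```javascript
// Hypothetical sketch of the validation and clamping rules in the table above.
// Assumptions: out-of-range values are clamped, malformed values are dropped.
const clamp = (value, min, max) => Math.min(max, Math.max(min, value));

function normalizeImageOptions(options) {
  const opts = options ?? {};
  const normalized = {};
  const size = /^(\d+)x(\d+)$/.exec(opts.size ?? '');
  const ratio = /^(\d+):(\d+)$/.exec(opts.aspectRatio ?? '');
  if (size) {
    // size takes precedence over aspectRatio; each side is kept within 64–4096.
    const width = clamp(Number(size[1]), 64, 4096);
    const height = clamp(Number(size[2]), 64, 4096);
    normalized.size = `${width}x${height}`;
  } else if (ratio && Number(ratio[1]) > 0 && Number(ratio[2]) > 0) {
    // aspectRatio is used only when size is absent.
    normalized.aspectRatio = `${ratio[1]}:${ratio[2]}`;
  }
  // n is clamped to 1–4; seed is passed through only if it is an integer.
  if (Number.isFinite(opts.n)) normalized.n = clamp(Math.trunc(opts.n), 1, 4);
  if (Number.isInteger(opts.seed)) normalized.seed = opts.seed;
  return normalized;
}
```

Applying the same rules in the dialog before calling the tool makes the effective request visible to the user instead of leaving the adjustment to the server.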

Response size

The total image payload is capped (a few MB) to stay under the platform's response limit. A single ~1024² image is fine; if you request several large images and exceed the cap, the call is rejected cleanly rather than truncated. Prefer n: 1 and a modest size unless you need more.
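
If you want to see how close a response comes to the cap, the decoded size of the returned images can be computed from the base64 strings. The helper names below are hypothetical, and because this page does not state the exact cap, the sketch only measures the payload and does not compare it against a limit.

```javascript
// Decoded byte length of a base64 string, derived from its length and padding.
function base64ByteLength(data) {
  const padding = data.endsWith('==') ? 2 : data.endsWith('=') ? 1 : 0;
  return Math.floor((data.length * 3) / 4) - padding;
}

// Sum the decoded size of every image content block in a CallToolResult.
function totalImageBytes(result) {
  return (result?.content || [])
    .filter((c) => c.type === 'image' && typeof c.data === 'string')
    .reduce((sum, c) => sum + base64ByteLength(c.data), 0);
}
```

Logging this value during development is one way to decide whether a larger size or a higher n is affordable for a given tool.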

Turning it on

Model-backed tools spend the chatbot owner's gateway budget, so they sit behind the same master switch as in-dialog completions:

  1. Open Chatbot → MCP Servers, then Edit the server.
  2. Enable "Allow this server's dialogs to run LLM completions" (allowLlmCompletion).
  3. Save.

Declaring _meta.wizchat.model is necessary but not sufficient — without the toggle, a model-backed tool is rejected (so a server can't quietly spend your tokens just by declaring a model).

Limits & security

  • Owner-gated. Runs only when the owner has enabled completions for that server; otherwise rejected.
  • Re-gated per call. Like every dialog action, each call re-verifies the signed-in user, their access to the server, and the live owner toggle — server-side.
  • Bounded spend. Shares the in-dialog completion budget: up to 5 model runs per opened dialog, plus a per-user/per-chatbot hourly rate limit. The tool input is capped at ~100 KB; text output at 2,000 tokens; image payloads are size-capped.
  • No prompt/output retention. WizChat records the call for cost/observability (gateway model + generation id) but does not retain the prompt, the completion text, or the image bytes in its tracing.
  • Gateway only, owner's account. The model runs through the AI Gateway on the owner's account — the tool can't pass its own API key, and cost is tracked like any other WizChat usage.
  • 30-second timeout. A hung model call fails cleanly rather than hanging the dialog.
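
Several of the limits above can be pictured as checks that run before and around each model call. The sketch below is illustrative only and is not WizChat's code: the context shape (`server`, `dialog`, `input`), the function names, and the exact byte value chosen for "~100 KB" are assumptions, and the user verification and hourly rate limit are left out.

```javascript
// Hypothetical sketch of some per-call limits listed above.
const MAX_RUNS_PER_DIALOG = 5;        // "up to 5 model runs per opened dialog"
const MAX_INPUT_BYTES = 100 * 1024;   // assumption: "~100 KB" taken as 100 KiB
const TIMEOUT_MS = 30000;             // "30-second timeout"

function checkCallAllowed({ server, dialog, input }) {
  // The owner toggle is re-read on every call.
  if (!server.allowLlmCompletion) return { ok: false, reason: 'completions disabled by owner' };
  if (dialog.modelRuns >= MAX_RUNS_PER_DIALOG) return { ok: false, reason: 'dialog budget exhausted' };
  // Measure the input in UTF-8 bytes, not characters.
  const inputBytes = new TextEncoder().encode(input).length;
  if (inputBytes > MAX_INPUT_BYTES) return { ok: false, reason: 'input too large' };
  return { ok: true };
}

// Reject if the wrapped promise does not settle within the timeout.
function withTimeout(promise, ms = TIMEOUT_MS) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`model call timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

For dialog authors, the practical point is that any of these checks can fail a call, so the calling code should handle a rejected call as a normal outcome.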

The model can be wrong

Treat a model-backed result as a suggestion. Show it to the user and validate any structured output (e.g. with your server's own validation tool) before saving or acting on it.
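
As a minimal sketch of that advice, a dialog could run a cheap structural check on the model's text before passing it to a server-side validation tool or showing a save button. The helper name `parseModelJson` and the field names in the usage line are hypothetical.

```javascript
// Hypothetical helper: parse a model's text result as JSON and check that the
// required fields are present as strings before the dialog acts on it.
function parseModelJson(text, requiredStringFields) {
  let value;
  try {
    value = JSON.parse(text);
  } catch {
    return { ok: false, error: 'model output is not valid JSON' };
  }
  if (value === null || typeof value !== 'object' || Array.isArray(value)) {
    return { ok: false, error: 'model output is not a JSON object' };
  }
  const missing = requiredStringFields.filter((field) => typeof value[field] !== 'string');
  if (missing.length > 0) {
    return { ok: false, error: `missing or non-string fields: ${missing.join(', ')}` };
  }
  return { ok: true, value };
}

// Usage (field names are illustrative):
// const checked = parseModelJson(text, ['title', 'summary']);
// if (!checked.ok) { /* show the error and let the user retry or edit */ }
```

A check like this only confirms shape; whether the content is correct still needs the user's review or your server's own validation.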

Scope today

Model-backed tools run on the MCP-UI dialog path — i.e. when a tool is fired from a server-rendered dialog. A tool the chat agent chooses to call mid-conversation still dispatches to your server normally.