
Local Execution (Return-Script)

Most MCP tools execute server-side — WizChat calls your endpoint, your code runs, and the result comes back. Local execution (also called return-script mode) inverts that: WizChat never runs the tool. Instead it hands a script to a runtime on the user's machine, that runtime executes it locally, and WizChat synthesizes the final answer from the results.

Use this when the work has to happen on the user's device — driving locally-installed software (CAD/CAM automation via COM, desktop apps, hardware), touching the local filesystem, or anything that can't or shouldn't run in WizChat's cloud.

Who this is for

This page is for developers building their own MCP server + a local runtime to pair with it. The built-in servers and standard custom servers run server-side and need none of this.

How it works: the two legs

A return-script turn is split into two round-trips ("legs") with your local runtime in the middle.

sequenceDiagram
participant U as User
participant W as WizChat
participant R as Local Runtime
participant M as Your MCP Server

U->>W: Question
Note over W: Leg 1 — plan
W->>W: LLM picks tool + args
W->>M: resources/read (fetch script body)
M-->>W: script source
W-->>R: SSE: script_plan { steps[] }
Note over R: Execute locally
R->>R: run script (e.g. cscript file.vbs)
R->>W: POST executionResults[]
Note over W: Leg 2 — synthesize
W->>W: LLM writes answer from results
W-->>U: Final answer

  1. Leg 1 (plan). The model decides which of your tools to call and with what arguments. Instead of executing, WizChat fetches the matching script body from your server via MCP resources/read, then emits a script_plan event over the chat SSE stream and stops without producing an answer.
  2. Your runtime executes. A program you build on the user's machine receives the script_plan, runs each step locally, and collects the output.
  3. Leg 2 (synthesize). Your runtime POSTs the results back to the chat endpoint. WizChat runs a second LLM pass that turns the raw results into a natural-language answer and streams it to the user.

The two legs are billed and traced separately. Nothing in your script ever executes on WizChat infrastructure.

What your MCP server must provide

Your server is a normal MCP server with two specific obligations:

  1. Tools — declared the usual way. The model reads their schemas and picks one plus arguments. Tools intended for local execution should share a common name prefix (e.g. solidcam_) so the owner can target them with one config.
  2. Script bodies as MCP resources — the actual code to run, exposed via resources/read and addressable by a URI. WizChat resolves a URI template (below) to a concrete URI like myserver://recipes/session_launch, calls resources/read, and expects:
{
"contents": [
{ "uri": "myserver://recipes/session_launch", "mimeType": "text/vbs", "text": "<the script source>" }
]
}

Only the first content item's text is used. If the read fails or returns no text, that step's source is null (the turn still proceeds — your runtime decides how to handle a missing body).
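The "first content item only" rule can be checked against your own server's responses with a small helper. This is a minimal sketch of the rule as described above, not WizChat's actual code; the function name `extract_script_source` is hypothetical, and treating an empty string as "no text" is an assumption.

```python
from typing import Any, Optional


def extract_script_source(read_result: Optional[dict[str, Any]]) -> Optional[str]:
    """Return the text of the first content item, or None if it is unavailable."""
    if not isinstance(read_result, dict):
        return None  # the read failed or returned something unexpected
    contents = read_result.get("contents")
    if not isinstance(contents, list) or not contents:
        return None
    first = contents[0]
    if not isinstance(first, dict):
        return None
    text = first.get("text")
    # Assumption: only a non-empty string counts as a usable script body.
    return text if isinstance(text, str) and text else None
```

Running this over a recorded resources/read response is a quick way to confirm that the script body sits in the first item rather than a later one.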

Tool vs. resource

The tool is the addressable action the LLM selects (solidcam_session_launch). The resource is the code that action maps to (solidcam://recipes/session_launch). One tool → one script resource, linked by the URI template.

Owner configuration (localScriptFormat)

WizChat itself is server-agnostic — it hardcodes no URI schemes, languages, or invocation patterns. The chatbot owner wires your server up in the dashboard with two settings on the MCP server record:

  • executionMode: "return_script" — switches the server into local-execution mode.
  • localScriptFormat — tells WizChat how to map a tool to a fetchable resource, what language to label it, and what shell invocation (if any) to suggest.

localScriptFormat fields

| Field | Required | Description |
| --- | --- | --- |
| resourceUriTemplate | | Template for the resources/read URI. Example: myserver://recipes/{toolId} |
| language | | Language label for the plan and code fences. Example: vbscript, powershell, python |
| filenameTemplate | | Suggested filename for the runtime to write. Example: {toolId}.vbs |
| toolNamePrefix | | Only tools whose name starts with this are treated as local-execution tools. The prefix is also stripped to form {toolId}. |
| invocationTemplate | | Shell command hint. Example: cscript //nologo {filename} {args}. Omit if your runtime knows how to run the file itself. |
| mimeType | | Expected MIME type. Falls back to whatever the resources/read response declares, then text/plain. |
| argsFormat | | How the model's argument object is rendered into {args} (see below). Default cscript-named-slash. |

Template placeholders

| Token | Valid in | Expands to |
| --- | --- | --- |
| {toolName} | any template | Full tool name, e.g. solidcam_session_launch |
| {toolId} | any template | Tool name with toolNamePrefix stripped, e.g. session_launch |
| {filename} | invocationTemplate only | The rendered filenameTemplate |
| {args} | invocationTemplate only | The arguments rendered per argsFormat |

Unknown tokens are left untouched.
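The placeholder rules can be illustrated with a short sketch. It is not WizChat's resolver: the names `tool_id` and `render` are hypothetical, and the code assumes tokens are plain `{word}` markers expanded in a single pass. It also does not enforce which tokens are valid in which template; the caller passes only the values that apply.

```python
import re
from typing import Optional

_TOKEN = re.compile(r"\{(\w+)\}")


def tool_id(tool_name: str, prefix: str) -> Optional[str]:
    """Strip the prefix; return None if the tool is not a local-execution tool."""
    if not tool_name.startswith(prefix):
        return None
    return tool_name[len(prefix):]


def render(template: str, values: dict[str, str]) -> str:
    """Expand known {tokens} and leave unknown ones untouched."""
    return _TOKEN.sub(lambda m: values.get(m.group(1), m.group(0)), template)


values = {
    "toolName": "solidcam_session_launch",
    "toolId": tool_id("solidcam_session_launch", "solidcam_"),
}
print(render("solidcam://recipes/{toolId}", values))  # solidcam://recipes/session_launch
print(render("{toolId}.{ext}", values))               # session_launch.{ext}
```

The second call shows the unknown-token rule: `{ext}` has no value, so it passes through unchanged.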

argsFormat options

| Value | Renders args as |
| --- | --- |
| cscript-named-slash (default) | /key:value /key2:value2 (Windows cscript) |
| posix-double-dash | --key value --key2 value2 |
| posix-equals | --key=value --key2=value2 |
| space-separated | value1 value2 (positional) |
| env-vars | KEY=value KEY2=value2 |
| json | a single JSON-stringified blob |
| none | empty (your script reads stdin / env itself) |

Example

{
"executionMode": "return_script",
"localScriptFormat": {
"toolNamePrefix": "solidcam_",
"resourceUriTemplate": "solidcam://recipes/{toolId}",
"language": "vbscript",
"mimeType": "text/vbs",
"filenameTemplate": "{toolId}.vbs",
"invocationTemplate": "cscript //nologo {filename} {args}",
"argsFormat": "cscript-named-slash"
}
}

With this config, a solidcam_session_launch call resolves to resource solidcam://recipes/session_launch, suggests filename session_launch.vbs, and an invocation of cscript //nologo session_launch.vbs /partPath:"C:\CAM Projects\bracket.prz". Argument values are quoted only when they contain whitespace — a value without spaces is passed unquoted (e.g. /partPath:C:\bracket.prz).
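To preview what an argsFormat produces for a given argument object, the rendering can be approximated as below. This is a sketch for inspecting a config, not WizChat's implementation and not something to feed to a shell. Assumptions: the quote-on-whitespace rule applies to every format (the text above states it only for the cscript example), env-vars upper-cases the keys, and non-string values are rendered with `str()`. The name `render_args` is hypothetical.

```python
import json
from typing import Any


def _quote(value: Any) -> str:
    """Quote only when the value contains whitespace."""
    text = str(value)
    return f'"{text}"' if any(ch.isspace() for ch in text) else text


def render_args(args: dict[str, Any], fmt: str = "cscript-named-slash") -> str:
    if fmt == "cscript-named-slash":
        return " ".join(f"/{k}:{_quote(v)}" for k, v in args.items())
    if fmt == "posix-double-dash":
        return " ".join(f"--{k} {_quote(v)}" for k, v in args.items())
    if fmt == "posix-equals":
        return " ".join(f"--{k}={_quote(v)}" for k, v in args.items())
    if fmt == "space-separated":
        return " ".join(_quote(v) for v in args.values())
    if fmt == "env-vars":
        # Assumption: keys are upper-cased, as the KEY=value example suggests.
        return " ".join(f"{k.upper()}={_quote(v)}" for k, v in args.items())
    if fmt == "json":
        return json.dumps(args)
    if fmt == "none":
        return ""
    raise ValueError(f"unknown argsFormat: {fmt}")
```

With the example config, `render_args({"partPath": r"C:\CAM Projects\bracket.prz"})` yields the quoted form shown above, while a path without spaces comes out unquoted.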

The runtime contract

This is the program you build and ship to the user's machine (an embedded WebView host, a desktop add-in, a CLI bridge — your choice). It has three jobs.

1. Receive the script_plan event

Open the chat SSE stream as a normal client. When a turn targets your server, you receive a script_plan event instead of a streamed answer:

{
"kind": "script_plan",
"version": 1,
"steps": [
{
"toolName": "solidcam_session_launch",
"args": { "partPath": "C:\\CAM Projects\\bracket.prz" },
"source": "<the script body, fetched from your MCP resource>",
"language": "vbscript",
"mimeType": "text/vbs",
"sourceUri": "solidcam://recipes/session_launch",
"filename": "session_launch.vbs",
"invocation": "cscript //nologo session_launch.vbs /partPath:\"C:\\CAM Projects\\bracket.prz\""
}
],
"question": "<the user's original question>",
"ragContext": "<optional retrieved context>",
"sources": [ /* optional citation sources */ ],
"responseLanguage": "en",
"leg1TraceId": "<opaque tracing id>"
}

Every per-step field except toolName and args may be null. If the format resolver can't fully resolve a tool — its name doesn't match toolNamePrefix, or resources/read returned no text — then source, language, mimeType, sourceUri, filename, and invocation each arrive as null. Handle them defensively rather than assuming they're present.
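One way to handle the nullable fields is to parse each step into a typed record where everything except the tool name and arguments is optional. The sketch below assumes the event has already been decoded from JSON into a dict; `PlanStep`, `parse_steps`, and the `runnable` property are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class PlanStep:
    tool_name: str
    args: dict[str, Any]
    source: Optional[str] = None
    language: Optional[str] = None
    mime_type: Optional[str] = None
    source_uri: Optional[str] = None
    filename: Optional[str] = None
    invocation: Optional[str] = None

    @property
    def runnable(self) -> bool:
        # Without a script body there is nothing to execute locally.
        return self.source is not None


def parse_steps(plan: dict[str, Any]) -> list[PlanStep]:
    if plan.get("kind") != "script_plan":
        raise ValueError("not a script_plan event")
    steps = []
    for raw in plan.get("steps") or []:
        steps.append(PlanStep(
            tool_name=raw["toolName"],
            args=raw.get("args") or {},
            # .get() covers both an explicit null and a missing key.
            source=raw.get("source"),
            language=raw.get("language"),
            mime_type=raw.get("mimeType"),
            source_uri=raw.get("sourceUri"),
            filename=raw.get("filename"),
            invocation=raw.get("invocation"),
        ))
    return steps
```

A runtime can then branch on `step.runnable` and report a failed result for steps that arrived without a body, rather than crashing on a null.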

2. Execute each step locally

For each step: write source to filename, run it, and capture the result. Use invocation as a hint if present — but see the security note below for when it's absent.
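A step executor might look like the following sketch. It makes several assumptions that are not part of the contract: the script is written to a temporary directory, the interpreter is passed in as an argument list (`runner`, for example `["cscript", "//nologo"]`), stdout is returned as the result's data, and a non-zero exit code counts as failure. The function name `run_step` and its parameters are hypothetical.

```python
import subprocess
import tempfile
import time
from pathlib import Path
from typing import Any, Sequence


def run_step(index: int, step: dict[str, Any], runner: Sequence[str],
             extra_args: Sequence[str] = (), timeout_s: float = 60.0) -> dict[str, Any]:
    """Write step['source'] to a temp dir, run it, and return an executionResults entry."""
    started = time.monotonic()
    result: dict[str, Any] = {"stepIndex": index, "toolName": step["toolName"]}
    try:
        if step.get("source") is None:
            raise RuntimeError("script body missing (source is null)")
        with tempfile.TemporaryDirectory() as workdir:
            # Keep only the final path component so the file is written inside workdir.
            name = Path(step.get("filename") or "step_script").name
            script = Path(workdir) / name
            script.write_text(step["source"], encoding="utf-8")
            # An argument list with the default shell=False is never parsed by a shell.
            proc = subprocess.run([*runner, str(script), *extra_args],
                                  capture_output=True, text=True, timeout=timeout_s)
        if proc.returncode != 0:
            raise RuntimeError(proc.stderr.strip() or f"exit code {proc.returncode}")
        result.update(success=True, data={"stdout": proc.stdout})
    except Exception as exc:
        result.update(success=False, error=str(exc))
    result["durationMs"] = int((time.monotonic() - started) * 1000)
    return result
```

Passing the interpreter and arguments as a list, instead of executing the `invocation` string, keeps the step runnable even when the hint is missing.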

3. POST results back

POST to the same chat endpoint with an executionResults array, echoing the plan's context fields back unchanged so Leg 2 can synthesize a grounded answer:

{
"executionResults": [
{
"stepIndex": 0,
"toolName": "solidcam_session_launch",
"success": true,
"data": { "openedPart": "C:\\CAM Projects\\bracket.prz", "operations": 12 },
"durationMs": 1840
}
],
// echo these back from the script_plan you received:
"question": "<original question>",
"ragContext": "<as received>",
"sources": [ /* as received */ ],
"responseLanguage": "en",
"leg1TraceId": "<as received>",
"mcpModelContext": { /* as received, if present */ }
}

WizChat synthesizes the final answer from executionResults and streams it to the user.

mcpModelContext (optional)

When the chatbot owner sets a per-server model override on a local-execution MCP server, the script_plan carries an mcpModelContext object so the Leg-2 answer synthesis uses the same model. Echo it back verbatim if present — WizChat re-validates it against the chatbot's current configuration server-side (it never trusts the echoed model), so the field is observability/scoping only and is safe to forward unchanged. It is omitted entirely when no override applies; just leave it out then.

Send a fresh harnessRunId, never a resume token

If your client tracks a harnessRunId (or resumeToken) for stream resumption, mint a fresh harnessRunId for the Leg-2 POST and do not echo the Leg-1 one. The Leg-1 value targets a resume gate that would replay the deferred plan frame and silently drop your executionResults — the whole callback would no-op. (WizChat also suppresses the resume token when executionResults are present, as a safety net, but the contract is: fresh id, no resume token.)

executionResults entry fields

| Field | Description |
| --- | --- |
| stepIndex | Index into the plan's steps[] this result is for |
| toolName | The tool that ran |
| success | true / false |
| data | Free-form JSON: the tool's output on success |
| error | Error message string when success is false |
| durationMs | Local wall-clock duration |

Forward unknown fields verbatim

Echo back every field you received in the script_plan, even ones you don't recognize — don't whitelist. WizChat adds fields over time (leg1TraceId links the two legs in tracing; more may follow). A runtime that drops unknown fields keeps working but silently loses those features. A runtime that forwards everything stays forward-compatible for free.
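The echo-everything rule and the fresh-run-id rule can be combined when building the Leg-2 body. The sketch below rests on assumptions: `kind`, `version`, and `steps` are treated as plan-only keys that are not echoed (the example POST above does not include them), `harnessRunId` travels as a body field, and a UUID is an acceptable id format. `build_leg2_body` and `new_run_id` are hypothetical names.

```python
import uuid
from typing import Any

# Assumption: these keys describe the plan itself and are not part of the echo.
_PLAN_ONLY_KEYS = {"kind", "version", "steps"}
# Leg-1 stream-resumption values that must not be reused on the Leg-2 POST.
_RESUME_KEYS = {"harnessRunId", "resumeToken"}


def build_leg2_body(plan: dict[str, Any], results: list[dict[str, Any]],
                    new_run_id: bool = False) -> dict[str, Any]:
    """Copy every plan field by exclusion, so unknown future fields survive."""
    skip = _PLAN_ONLY_KEYS | _RESUME_KEYS
    body = {key: value for key, value in plan.items() if key not in skip}
    body["executionResults"] = results
    if new_run_id:
        # Only for clients that track run ids: mint a fresh one for Leg 2.
        body["harnessRunId"] = str(uuid.uuid4())
    return body
```

Filtering by exclusion rather than by a whitelist is what keeps the runtime forward-compatible: a field added to script_plan later is forwarded without a code change.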

Security

WizChat treats your tool arguments as untrusted and guards against shell-injection before suggesting an invocation:

  • If any argument value contains a shell metacharacter (& | ; < > $ \ " % ^, control chars, plus ( ) ' for PowerShell/quoted templates), the invocation field is omitted from that step. The source, filename, args, and sourceUri are still delivered.
  • Your runtime must not naively string-concatenate args into a shell command. When invocation is absent, build the command yourself using a safe argument-passing API (argument arrays, parameterized calls) — never a single concatenated shell string.
  • The arguments boundary is JSON-only. Pass values to your script as structured arguments or stdin, not by interpolating into a command line.

Treat a missing invocation as a signal that an argument needs careful handling, not as an error.
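A runtime can apply the same caution on its side. The sketch below copies the character set from the list above as printed, treats code points below 32 and 127 as "control chars" (an assumption), and builds an argument list in which every value is its own element. The `/key:value` shape mirrors the cscript example; how a given interpreter parses such elements is something to verify for your target. `is_risky` and `build_argv` are hypothetical names.

```python
from typing import Any

# Characters from the list above; the second set applies to PowerShell/quoted templates.
_META = set('&|;<>$\\"%^')
_META_QUOTED = set("()'")


def is_risky(value: Any, quoted_template: bool = False) -> bool:
    """True if the value contains a listed metacharacter or a control character."""
    risky = (_META | _META_QUOTED) if quoted_template else _META
    return any(ch in risky or ord(ch) < 32 or ord(ch) == 127 for ch in str(value))


def build_argv(runner: list[str], script_path: str, args: dict[str, Any]) -> list[str]:
    """Each argument becomes one argv element, so no shell ever parses the values."""
    argv = [*runner, script_path]
    for key, value in args.items():
        argv.append(f"/{key}:{value}")
    return argv
```

The resulting list can be handed to a process-spawning API that accepts argument arrays, which avoids building a single concatenated command string.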

Checklist

To make your MCP server work with WizChat local execution:

  • Declare your tools with a shared name prefix.
  • Expose each tool's script body as an MCP resource readable via resources/read, returning { contents: [{ text, mimeType }] }.
  • Have the owner set executionMode: "return_script" + a localScriptFormat on the server in the dashboard.
  • Build a local runtime that: reads script_plan from the SSE stream → executes each step → POSTs executionResults back, echoing all plan fields.
  • Pass arguments safely; never shell-concatenate, especially when invocation is omitted.