Run LLM-Generated Code Safely with Cloudflare codemode and Dynamic Workers
Cloudflare's Agents SDK v0.16.1, shipped June 16, 2026, includes code mode: a way to let a model write a single program and execute it inside a sandboxed Dynamic Worker instead of making one tool call at a time.

Introduction
This tutorial builds an agent that hands the model a set of tools, lets it author code that orchestrates them, and runs that code in an isolate with zero ambient authority. The model gets to program. It does not get to touch anything you did not hand it.
The reason to care is not speed, though it is faster. It is that "the model wrote some code, now run it" is one of the most dangerous sentences in this field, and codemode is the part that makes it sayable.
Why letting a model run code is dangerous
Hand a language model an eval and you have handed it your process. Whatever your worker can reach, the generated code can reach: your environment variables, your network, your filesystem, your credentials. The model does not need to be malicious for this to end badly. It needs to be confidently wrong once, in a program that has the same permissions you do.
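To make that concrete, here is a minimal sketch, assuming a generic JavaScript runtime such as Node rather than a deployed Worker. `API_TOKEN` and `runUnsafely` are hypothetical names used only for illustration; the global stands in for whatever secrets your process can reach.

```typescript
// Hypothetical stand-in for ambient state: a secret the host process can reach.
const hostGlobals = globalThis as unknown as Record<string, unknown>;
hostGlobals.API_TOKEN = "sk-demo-not-a-real-token";

// Hypothetical helper: runs a source string with the host's full authority.
function runUnsafely(source: string): unknown {
  // new Function compiles the string against this process's global scope.
  return new Function(source)();
}

// Pretend the model produced this string. Nothing was granted, yet it can read the secret.
const generatedSource = "return globalThis.API_TOKEN;";
const leaked = runUnsafely(generatedSource);

console.log(leaked === hostGlobals.API_TOKEN);
```

The generated string never asked for permission. It inherited everything the host could see, which is exactly what "ambient authority" means.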
The usual workaround is to never run model code at all and instead expose narrow tools the model calls one at a time. That is safer and slower, and it falls apart the moment a task needs a loop, a conditional, or twelve calls chained together.
What codemode and Dynamic Workers change
codemode runs the model's program in a Dynamic Worker, a fresh V8 isolate that starts in about 100 milliseconds and has no ambient authority. Nothing is reachable by default. The sandbox can only do what you explicitly grant through bindings you pass in. The model writes one program that calls your tools; the program runs isolated from the host; and the only powers it has are the ones you decided to lend it.
This is the capability model, and it inverts the danger. Instead of running code that can do anything and hoping it behaves, you run code that can do nothing and grant it exactly the few things the task needs.
The naive way, and where it breaks
A standard tool-calling agent makes the model emit one call, waits, emits the next, waits. For a task like "find every TypeScript file, read each one, count the TODOs," that is a round trip per file. The model spends most of its turn budget on plumbing.
```typescript
// the slow path: one model round trip per step
const files = await model.callTool("find", { pattern: "**/*.ts" });
for (const file of files) {
  const content = await model.callTool("read", { path: file });
  // ...count TODOs, another round trip to report
}
```

codemode lets the model write that whole loop as one program and run it once. The bridge sentence is short because the idea is: stop relaying, start executing, but execute somewhere safe.
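The difference in round trips can be counted with a toy model. This is a sketch under stated assumptions: the in-memory `workspace`, the `find` and `read` helpers, and the `roundTrips` counter are all hypothetical, and no real model or sandbox is involved. It only counts how often the model would have to be consulted.

```typescript
// Hypothetical in-memory workspace standing in for real files.
const workspace: Record<string, string> = {
  "src/a.ts": "// TODO one\nconst x = 1; // TODO two",
  "src/b.ts": "export {};",
  "src/c.ts": "// TODO three",
};

// Plain tool implementations; neither one involves the model.
const find = (): string[] => Object.keys(workspace);
const read = (path: string): string => workspace[path] ?? "";
const countTodos = (text: string): number => text.split("TODO").length - 1;

// Tool-calling path: the model is consulted once for every tool call.
function toolCallingRun(): { todos: number; roundTrips: number } {
  let roundTrips = 1; // the model emits the find call
  let todos = 0;
  for (const file of find()) {
    roundTrips += 1; // the model emits one read call per file
    todos += countTodos(read(file));
  }
  return { todos, roundTrips };
}

// Program path: the model emits one program, and the loop runs without it.
function programRun(): { todos: number; roundTrips: number } {
  const todos = find().reduce((sum, file) => sum + countTodos(read(file)), 0);
  return { todos, roundTrips: 1 };
}

const slow = toolCallingRun();
const fast = programRun();
console.log(slow, fast);
```

Both paths find the same three TODOs. The tool-calling path needs one round trip for `find` plus one per file, while the program path needs one in total, and the gap grows with the number of files.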
Expose tools the model can program against
In a Think agent, getTools() returns the toolset. createExecuteTool wraps your tools so the model can call them from inside generated code, and runs that code through a loader-backed sandbox. The model now writes a program against find, read, and friends rather than calling them one by one.
```typescript
// src/agent.ts
import { Think, createExecuteTool, createWorkspaceTools } from "@cloudflare/think";

interface Env {
  AI: Ai;
  LOADER: DynamicWorkerLoader;
}

export class CodeAgent extends Think<Env> {
  getModel() {
    // ... return your LanguageModel
  }

  getTools() {
    return {
      execute: createExecuteTool({
        tools: createWorkspaceTools(this.workspace),
        loader: this.env.LOADER,
      }),
    };
  }
}
```

The loader binding is what runs generated code in a Dynamic Worker. Without it, there is nowhere safe to execute, and the tool will not arm.
Grant the sandbox explicit capabilities
Zero ambient authority means the sandbox starts with nothing. You add powers by passing bindings into the runtime. For the durable, gated version, createCodemodeRuntime takes an executor (the Dynamic Worker) and a list of connectors, each of which is a single, named capability.
```typescript
import { createCodemodeRuntime, DynamicWorkerExecutor } from "@cloudflare/codemode";
import { GithubConnector } from "@cloudflare/codemode/connectors";

// Fragment: this belongs inside a method of your agent class, where
// this.ctx, this.env, and a GitHub connection value are already in scope.
const runtime = createCodemodeRuntime({
  ctx: this.ctx,
  executor: new DynamicWorkerExecutor({ loader: this.env.LOADER }),
  connectors: [
    new GithubConnector(this.ctx, this.env, connection),
  ],
});
```

Read that connectors array as the complete list of things the generated program is allowed to touch. There is no GitHub access unless GithubConnector is in the list. There is no network call to anywhere else, ever, because you never lent it. (The default is no, and for once that is the safe default rather than the annoying one.)
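The allow-list idea can be sketched without any Cloudflare code. The `Connector` shape, `createInvoker`, and the `github` and `stripe` names below are hypothetical and are not the @cloudflare/codemode types. The sketch shows only the deny-by-default lookup, not the isolation a Dynamic Worker adds on top.

```typescript
// Hypothetical shapes for illustration; not the @cloudflare/codemode types.
type Action = (input: string) => string;

interface Connector {
  name: string;
  actions: Map<string, Action>;
}

// Builds the single entry point a generated program would receive.
function createInvoker(connectors: Connector[]) {
  const registry = new Map<string, Connector>();
  for (const connector of connectors) registry.set(connector.name, connector);

  return function invoke(connector: string, action: string, input: string): string {
    const fn = registry.get(connector)?.actions.get(action);
    // Deny by default: anything not explicitly registered is refused.
    if (!fn) throw new Error(`not granted: ${connector}.${action}`);
    return fn(input);
  };
}

const github: Connector = {
  name: "github",
  actions: new Map<string, Action>([
    ["list_issues", (repo: string) => `issues for ${repo}`],
  ]),
};

// Only the GitHub connector is lent to the program.
const invoke = createInvoker([github]);
const granted = invoke("github", "list_issues", "acme/site");

let denied = "";
try {
  invoke("stripe", "refund", "ch_123"); // never registered, so it must fail
} catch (err) {
  denied = (err as Error).message;
}
console.log(granted, "|", denied);
```

The design choice worth noticing is that refusal is the fallthrough case. Nobody had to remember to block the `stripe` call; it fails because nobody granted it.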
Add approval gates for sensitive actions
v0.16.1 records codemode execution to durable logs, which means a run can pause at a sensitive action, wait for a human, and replay from the log after approval. You mark the connector action as gated; the runtime checkpoints, suspends, and resumes the exact program after you say yes.
```typescript
// Fragment: same setup as before, inside a method of your agent class.
const runtime = createCodemodeRuntime({
  ctx: this.ctx,
  executor: new DynamicWorkerExecutor({ loader: this.env.LOADER }),
  connectors: [
    new GithubConnector(this.ctx, this.env, connection, {
      gate: ["create_pr"], // pause before this action, resume on approval
    }),
  ],
});
```

The program runs at full speed until it reaches create_pr, then stops cold and waits. Approve it and the durable log replays the run from where it paused, not from the top. The model never gets to open a pull request while you are at lunch.
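One common way to get pause-and-resume behavior is a replay log: results of completed calls are recorded, and a resumed run reads them back instead of repeating the side effects. The source does not say how codemode implements its durable log, so treat this as an illustrative sketch only. `GatedRuntime`, `PauseSignal`, and the action names are hypothetical.

```typescript
// Hypothetical replay-log sketch; not the @cloudflare/codemode implementation.
type Call = (action: string, input: string) => string;
type Program = (call: Call) => string;
type Outcome = { status: "paused"; action: string } | { status: "done"; value: string };

class PauseSignal {
  action: string;
  constructor(action: string) {
    this.action = action;
  }
}

class GatedRuntime {
  executions = 0; // how many actions really ran, as opposed to being replayed
  private log: string[] = []; // stands in for the durable log of call results
  private approved = new Set<string>();
  private actions: Map<string, (input: string) => string>;
  private gated: Set<string>;

  constructor(actions: Map<string, (input: string) => string>, gated: Set<string>) {
    this.actions = actions;
    this.gated = gated;
  }

  approve(action: string): void {
    this.approved.add(action);
  }

  run(program: Program): Outcome {
    let cursor = 0; // position in the log for this attempt
    const call: Call = (action, input) => {
      const logged = this.log[cursor];
      if (logged !== undefined) {
        cursor += 1;
        return logged; // replay: reuse the recorded result, no side effect
      }
      if (this.gated.has(action) && !this.approved.has(action)) {
        throw new PauseSignal(action); // suspend before the sensitive action
      }
      const fn = this.actions.get(action);
      if (!fn) throw new Error(`not granted: ${action}`);
      const result = fn(input);
      this.executions += 1;
      this.log.push(result);
      cursor += 1;
      return result;
    };
    try {
      return { status: "done", value: program(call) };
    } catch (err) {
      if (err instanceof PauseSignal) return { status: "paused", action: err.action };
      throw err;
    }
  }
}

const actions = new Map<string, (input: string) => string>([
  ["read_file", (path: string) => `contents of ${path}`],
  ["create_pr", (title: string) => `PR opened: ${title}`],
]);
const gatedRuntime = new GatedRuntime(actions, new Set(["create_pr"]));

const program: Program = (call) => {
  const text = call("read_file", "README.md");
  return call("create_pr", `Fix typo in ${text}`);
};

const first = gatedRuntime.run(program); // stops at the create_pr gate
gatedRuntime.approve("create_pr");
const second = gatedRuntime.run(program); // read_file is replayed, create_pr runs
console.log(first, second, gatedRuntime.executions);
```

In this sketch the program function is re-entered after approval, but `read_file` executes only once because its result comes back from the log on the second pass. That is the property that matters for a gate: work done before the pause is not repeated, and the gated action runs only after someone says yes.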
Model code in your worker vs in a Dynamic Worker
| Property | eval in your worker | codemode Dynamic Worker |
|---|---|---|
| Default access to env, network, secrets | full | none |
| How powers are granted | already there | explicit connectors only |
| Isolation from the host process | none | separate V8 isolate |
| Sensitive actions | run immediately | pause at approval gate |
| Cold start | n/a | about 100 ms |
| Blast radius of a wrong program | your whole worker | what you bound, nothing else |
What this does not do
The sandbox isolates execution; it does not make the model's judgment good. A program that has been granted the GitHub connector can still open a confidently wrong pull request, and codemode will run it perfectly. The capability model bounds what the code can reach, not whether reaching it was a good idea, and confusing those two is how teams talk themselves into granting too much. The honest framing: codemode shrinks the blast radius to exactly the set of powers you lent, so the only real safety question left is whether you lent too many.
The feature is also five days old as of writing. Treat the exact method names here as the shape of the API, check them against the current @cloudflare/codemode docs before you ship, and pin your version.
The full working agent
```typescript
// src/agent.ts
import { Think, createExecuteTool, createWorkspaceTools } from "@cloudflare/think";
import { createCodemodeRuntime, DynamicWorkerExecutor } from "@cloudflare/codemode";
import { GithubConnector } from "@cloudflare/codemode/connectors";

interface Env {
  AI: Ai;
  LOADER: DynamicWorkerLoader;
  GITHUB_TOKEN: string;
}

export class CodeAgent extends Think<Env> {
  getModel() {
    // return your LanguageModel instance
  }

  // Builds a runtime whose only capability is GitHub, with create_pr gated.
  buildRuntime(connection: unknown) {
    return createCodemodeRuntime({
      ctx: this.ctx,
      executor: new DynamicWorkerExecutor({ loader: this.env.LOADER }),
      connectors: [
        new GithubConnector(this.ctx, this.env, connection, {
          gate: ["create_pr"],
        }),
      ],
    });
  }

  getTools() {
    return {
      execute: createExecuteTool({
        tools: createWorkspaceTools(this.workspace),
        loader: this.env.LOADER,
      }),
    };
  }

  async run(task: string, connection: unknown) {
    const runtime = this.buildRuntime(connection);
    // generateProgram is not defined in this listing; supply your own method
    // that asks the model to write a program for the task.
    const program = await this.generateProgram(task);
    return runtime.run(program); // executes in a sandboxed Dynamic Worker
  }
}
```

Bind a LOADER for Dynamic Workers in wrangler.toml, deploy with wrangler deploy, and give the agent a task that needs more than one step. Watch it write one program, run it in the isolate, and stop at the gate before it touches your repository.
When to reach for it
Use codemode when the model needs to do real work with real tools and the cost of an ungated mistake is more than an apology: anything touching a repository, a payment, a production database. For a read-only agent that answers questions, the sandbox is weight you do not need, and a plain tool call is honest and enough. The line is the verb. The moment the model stops reading and starts doing, you want its hands in a box you packed yourself.