Run LLM-Generated Code Safely with Cloudflare codemode and Dynamic Workers

June 29, 2026 · 7 min read · 12 sections · By Ahmed Abdullah

Introduction

Cloudflare's Agents SDK v0.16.1, shipped June 16, 2026, includes codemode: a way to let a model write a single program and execute it inside a sandboxed Dynamic Worker instead of making one tool call at a time. This tutorial builds an agent that hands the model a set of tools, lets it author code that orchestrates them, and runs that code in an isolate with zero ambient authority. The model gets to program. It does not get to touch anything you did not hand it.

The reason to care is not speed, though it is faster. It is that "the model wrote some code, now run it" is one of the most dangerous sentences in this field, and codemode is the part that makes it sayable.

Why letting a model run code is dangerous

Hand a language model an eval and you have handed it your process. Whatever your worker can reach, the generated code can reach: your environment variables, your network, your filesystem, your credentials. The model does not need to be malicious for this to end badly. It needs to be confidently wrong once, in a program that has the same permissions you do.
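To make that concrete, here is a minimal sketch in plain TypeScript, run outside any Cloudflare runtime. The hostEnv object and its API_KEY are hypothetical stand-ins for a worker's bindings, and new Function plays the role of eval. Nothing in the generated string is malicious; it simply runs with the host's reach.

```typescript
// Hypothetical host state: stands in for the env bindings a worker holds.
const hostEnv = { API_KEY: "sk-demo-not-a-real-key" };
(globalThis as any).hostEnv = hostEnv;

// Pretend this string came back from the model.
const generatedCode = "return globalThis.hostEnv.API_KEY;";

// new Function (like eval) runs the string against the host's global scope.
const runInHost = new Function(generatedCode) as () => unknown;
const leaked = runInHost();

// The generated code read the host's secret without being handed it.
console.log(leaked === hostEnv.API_KEY); // true
```

The string never asked for permission because there was nothing to ask: the secret was already in scope.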

The usual workaround is to never run model code at all and instead expose narrow tools the model calls one at a time. That is safer and slower, and it falls apart the moment a task needs a loop, a conditional, or twelve calls chained together.

What codemode and Dynamic Workers change

codemode runs the model's program in a Dynamic Worker, a fresh V8 isolate that starts in about 100 milliseconds and has no ambient authority. Nothing is reachable by default. The sandbox can only do what you explicitly grant through bindings you pass in. The model writes one program that calls your tools; the program runs isolated from the host; and the only powers it has are the ones you decided to lend it.

This is the capability model, and it inverts the danger. Instead of running code that can do anything and hoping it behaves, you run code that can do nothing and grant it exactly the few things the task needs.

The naive way, and where it breaks

A standard tool-calling agent makes the model emit one call, waits, emits the next, waits. For a task like "find every TypeScript file, read each one, count the TODOs," that is a round trip per file. The model spends most of its turn budget on plumbing.

javascript

// the slow path: one model round trip per step
const files = await model.callTool("find", { pattern: "**/*.ts" });
for (const file of files) {
  const content = await model.callTool("read", { path: file });
  // ...count TODOs, another round trip to report
}

codemode lets the model write that whole loop as one program and run it once. The bridge sentence is short because the idea is: stop relaying, start executing, but execute somewhere safe.
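As a rough sketch of what that single program could look like, the block below runs the whole find, read, count loop in one execution. The in-memory workspace, the Tools interface, and countTodos are all hypothetical names for illustration; they are not the tools codemode exposes.

```typescript
// Hypothetical in-memory workspace standing in for real find/read tools.
const workspace = new Map<string, string>([
  ["src/a.ts", "// TODO: handle errors\nexport const a = 1;"],
  ["src/b.ts", "export const b = 2; // TODO: rename\n// TODO: test"],
  ["README.md", "TODO: write docs"],
]);

let toolCalls = 0; // counts tool invocations made by the program

interface Tools {
  find(suffix: string): Promise<string[]>;
  read(path: string): Promise<string>;
}

const tools: Tools = {
  async find(suffix) {
    toolCalls++;
    return Array.from(workspace.keys()).filter((p) => p.endsWith(suffix));
  },
  async read(path) {
    toolCalls++;
    const content = workspace.get(path);
    if (content === undefined) throw new Error(`no such file: ${path}`);
    return content;
  },
};

// The kind of program a model might author: the whole loop, one execution.
async function countTodos(t: Tools): Promise<{ files: number; todos: number }> {
  const paths = await t.find(".ts");
  let todos = 0;
  for (const path of paths) {
    const content = await t.read(path);
    todos += content.split("TODO").length - 1; // occurrences of "TODO"
  }
  return { files: paths.length, todos };
}

const summary = countTodos(tools);
summary.then((s) => console.log(s, { toolCalls }));
```

The tools are still called three times here, but the model is consulted once, to write the program, rather than once per call.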

Expose tools the model can program against

In a Think agent, getTools() returns the toolset. createExecuteTool wraps your tools so the model can call them from inside generated code, and runs that code through a loader-backed sandbox. The model now writes a program against find, read, and friends rather than calling them one by one.

typescript

// src/agent.ts
import { Think, createExecuteTool, createWorkspaceTools } from "@cloudflare/think";
interface Env {
  AI: Ai;
  LOADER: DynamicWorkerLoader;
}
export class CodeAgent extends Think<Env> {
  getModel() {
    // ... return your LanguageModel
  }
  getTools() {
    return {
      // one "execute" tool: the model writes code against the workspace tools
      execute: createExecuteTool({
        tools: createWorkspaceTools(this.workspace),
        loader: this.env.LOADER, // runs the generated code in a Dynamic Worker
      }),
    };
  }
}

The loader binding is what runs generated code in a Dynamic Worker. Without it, there is nowhere safe to execute, and the tool will not arm.

Grant the sandbox explicit capabilities

Zero ambient authority means the sandbox starts with nothing. You add powers by passing bindings into the runtime. For the durable, gated version, createCodemodeRuntime takes an executor (the Dynamic Worker) and a list of connectors, each of which is a single, named capability.

javascript

import { createCodemodeRuntime, DynamicWorkerExecutor } from "@cloudflare/codemode";
import { GithubConnector } from "@cloudflare/codemode/connectors";
const runtime = createCodemodeRuntime({
  ctx: this.ctx,
  executor: new DynamicWorkerExecutor({ loader: this.env.LOADER }),
  connectors: [
    new GithubConnector(this.ctx, this.env, connection),
  ],
});

Read that connectors array as the complete list of things the generated program is allowed to touch. There is no GitHub access unless GithubConnector is in the list. There is no network call to anywhere else, ever, because you never lent it. (The default is no, and for once that is the safe default rather than the annoying one.)
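The allow-list idea can be sketched without any Cloudflare code at all. In the block below, the Connector interface, buildSurface, and the github object are hypothetical and only illustrate the shape; they are not the @cloudflare/codemode interfaces. The program's entire call surface is built from the list, so anything not in the list has no handler to reach.

```typescript
// Hypothetical connector shape for illustration; not the @cloudflare/codemode interface.
interface Connector {
  name: string;
  actions: Map<string, (input: string) => string>;
}

const github: Connector = {
  name: "github",
  actions: new Map([["create_pr", (title: string) => `opened PR: ${title}`]]),
};

// Build the only call surface the program will ever see from the granted list.
function buildSurface(connectors: Connector[]) {
  const byName = new Map(connectors.map((c) => [c.name, c] as const));
  return (connector: string, action: string, input: string): string => {
    const handler = byName.get(connector)?.actions.get(action);
    if (!handler) throw new Error(`not granted: ${connector}.${action}`);
    return handler(input);
  };
}

const call = buildSurface([github]);

const ok = call("github", "create_pr", "fix typo");

let denied = "";
try {
  call("slack", "post", "hi"); // never granted, so there is nothing to dispatch to
} catch (e) {
  denied = e instanceof Error ? e.message : String(e);
}

console.log(ok); // opened PR: fix typo
console.log(denied); // not granted: slack.post
```

In a single process this is only a convention, since the code could still reach globals. The separate isolate is what turns the list into a boundary.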

Run a model-authored program

With the runtime built, the model writes a program and the runtime executes it in the isolate. The generated code calls only the connectors you provided, and the host process is never in scope.

typescript

async handle(task: string) {
  // the model produces a program string against the granted tools
  const program = await this.generateProgram(task);
  // executed in a Dynamic Worker, not in this worker
  const result = await this.runtime.run(program);
  return result;
}

The program might find files, loop over them, open a pull request through the GitHub connector, and return a summary, all in one execution. If it tries to read a secret you did not bind, the call is not denied at runtime so much as it never existed in the sandbox in the first place.

Add approval gates for sensitive actions

v0.16.1 records codemode execution to durable logs, which means a run can pause at a sensitive action, wait for a human, and replay from the log after approval. You mark the connector action as gated; the runtime checkpoints, suspends, and resumes the exact program after you say yes.

javascript

const runtime = createCodemodeRuntime({
  ctx: this.ctx,
  executor: new DynamicWorkerExecutor({ loader: this.env.LOADER }),
  connectors: [
    new GithubConnector(this.ctx, this.env, connection, {
      gate: ["create_pr"], // pause before this action, resume on approval
    }),
  ],
});

The program runs at full speed until it reaches create_pr, then stops cold and waits. Approve it and the durable log replays the run from where it paused, not from the top. The model never gets to open a pull request while you are at lunch.
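One common way to get pause-and-resume behavior is to record each action's result in a log and, on resume, answer already-completed actions from that log instead of running them again. The sketch below assumes that approach; GatedRuntime, Pause, and the action names are hypothetical, an in-memory array stands in for the durable log, and none of this is claimed to be how codemode implements it.

```typescript
// Hypothetical gate-and-replay sketch; names are illustrative, not the codemode API.
type Act = (action: string, input: string) => string;
type Outcome =
  | { status: "paused"; action: string }
  | { status: "done"; value: string };

// Sentinel thrown to stop the program at a gated action.
class Pause {
  readonly action: string;
  constructor(action: string) {
    this.action = action;
  }
}

class GatedRuntime {
  private readonly gated: Set<string>;
  private readonly approved = new Set<string>();
  private readonly log: string[] = []; // stands in for a durable log of results
  readonly executed: string[] = []; // actions that actually ran, for inspection

  constructor(gated: string[]) {
    this.gated = new Set(gated);
  }

  approve(action: string): void {
    this.approved.add(action);
  }

  run(program: (act: Act) => string): Outcome {
    let cursor = 0;
    const act: Act = (action, input) => {
      const recorded = this.log[cursor];
      if (recorded !== undefined) {
        cursor++;
        return recorded; // replay a recorded result; the action does not run again
      }
      if (this.gated.has(action) && !this.approved.has(action)) {
        throw new Pause(action); // stop before the gated action runs
      }
      this.executed.push(action);
      const result = `${action}(${input}) ok`;
      this.log.push(result);
      cursor++;
      return result;
    };
    try {
      return { status: "done", value: program(act) };
    } catch (e) {
      if (e instanceof Pause) return { status: "paused", action: e.action };
      throw e;
    }
  }
}

const gatedRuntime = new GatedRuntime(["create_pr"]);
const gatedProgram = (act: Act): string => {
  act("find_files", "**/*.ts");
  return act("create_pr", "fix TODOs");
};

const first = gatedRuntime.run(gatedProgram); // pauses at create_pr
gatedRuntime.approve("create_pr");
const second = gatedRuntime.run(gatedProgram); // find_files comes from the log

console.log(first, second, gatedRuntime.executed);
```

The point of the log is that find_files runs exactly once across both runs. Work done before the gate is not repeated after approval.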

Model code in your worker vs in a Dynamic Worker

Property	eval in your worker	codemode Dynamic Worker
Default access to env, network, secrets	full	none
How powers are granted	already there	explicit connectors only
Isolation from the host process	none	separate V8 isolate
Sensitive actions	run immediately	pause at approval gate
Cold start	n/a	about 100 ms
Blast radius of a wrong program	your whole worker	what you bound, nothing else

What this does not do

The sandbox isolates execution; it does not make the model's judgment good. A program that has been granted the GitHub connector can still open a confidently wrong pull request, and codemode will run it perfectly. The capability model bounds what the code can reach, not whether reaching it was a good idea, and confusing those two is how teams talk themselves into granting too much. The honest framing: codemode shrinks the blast radius to exactly the set of powers you lent, so the only real safety question left is whether you lent too many.

The feature is also less than two weeks old as of this writing. Treat the exact method names here as the shape of the API, check them against the current @cloudflare/codemode docs before you ship, and pin your version.

The full working agent

typescript

// src/agent.ts
import { Think, createExecuteTool, createWorkspaceTools } from "@cloudflare/think";
import { createCodemodeRuntime, DynamicWorkerExecutor } from "@cloudflare/codemode";
import { GithubConnector } from "@cloudflare/codemode/connectors";
interface Env {
  AI: Ai;
  LOADER: DynamicWorkerLoader;
  GITHUB_TOKEN: string;
}
export class CodeAgent extends Think<Env> {
  getModel() {
    // return your LanguageModel instance
  }
  // the connectors array is the complete list of powers the program gets
  buildRuntime(connection: unknown) {
    return createCodemodeRuntime({
      ctx: this.ctx,
      executor: new DynamicWorkerExecutor({ loader: this.env.LOADER }),
      connectors: [
        new GithubConnector(this.ctx, this.env, connection, {
          gate: ["create_pr"], // pause before this action, resume on approval
        }),
      ],
    });
  }
  getTools() {
    return {
      execute: createExecuteTool({
        tools: createWorkspaceTools(this.workspace),
        loader: this.env.LOADER,
      }),
    };
  }
  async run(task: string, connection: unknown) {
    const runtime = this.buildRuntime(connection);
    // the model produces a program string against the granted tools
    const program = await this.generateProgram(task);
    return runtime.run(program); // executes in a sandboxed Dynamic Worker
  }
}

Bind a LOADER for Dynamic Workers in wrangler.toml, deploy with wrangler deploy, and give the agent a task that needs more than one step. Watch it write one program, run it in the isolate, and stop at the gate before it touches your repository.

When to reach for it

Use codemode when the model needs to do real work with real tools and the cost of an ungated mistake is more than an apology: anything touching a repository, a payment, a production database. For a read-only agent that answers questions, the sandbox is weight you do not need, and a plain tool call is honest and enough. The line is the verb. The moment the model stops reading and starts doing, you want its hands in a box you packed yourself.
